
Full text of "Quantum Algebraic Topology and Symmetry: vols. I-III (2011)"



Quantum Algebra and 
Symmetry 

Quantum Algebraic Topology, Quantum 
Field Theories and Higher Dimensional 
Algebra 



PDF generated using the open source mwlib toolkit. See http://code.pediapress.com/ for more information. 
PDF generated at: Sun, 12 Dec 2010 01:24:24 UTC 



Contents 



Articles 

Copyright © 2010 by I.C. Baianu; v5, December 10, 2010
I.C. Baianu, Ph.D., M.Inst.P, Editor (with listed contributors)



Quantum Theories
    Quantum mechanics
    Schrödinger equation
    Dirac equation
    Klein-Gordon equation
    Einstein-Maxwell-Dirac equations
    Rigged Hilbert space
    Quantum inverse scattering method
    Quasi-Hopf algebra
    Quasitriangular Hopf algebra
    Ribbon Hopf algebra
    Quasi-triangular Quasi-Hopf algebra
    Grassmann algebra
    Supergroup
    Superalgebra
    Supergravity
    Quantum statistical mechanics
    Quantum thermodynamics
    Supertheory

Quantum Algebra
    Quantum algebra
    Lie algebra
    Lie group
    Hopf algebra
    Quantum group
    Affine quantum group
    Affine Lie algebra
    Quantum affine algebra
    Operator algebra
    Clifford algebra
    Distributions
    Hilbert space
    Von Neumann algebra
    C*-algebra
    Kac-Moody algebra
    Spectral theory

Quantum Field Theory, SUSY, Quantum Geometry and Quantum Algebraic Topology
    Quantum electrodynamics
    Quantum field theory
    Scalar field theory
    Yang-Mills theory
    Yangian
    Quantum spacetime
    Quantum gauge theory
    Standard Model
    Topological quantum field theory
    Quantum Chromodynamics
    Quantum Geometry
    Loop Quantum Gravity
    Quantum Algebraic Topology
    Commutativity
    Noncommutative quantum field theory
    Noncommutative standard model
    Nonabelian Gauge Theory
    List of quantum field theories
    Noncommutative geometry
    Quantum gravity
    String Theories
    Superstring Theories

Group Representations and Symmetry
    Symmetry
    Representations
    Representations of the symmetric group
    Representations of a finite group
    Representations of finite groups of Lie type
    Representations of Lie groups
    Representations of Lie algebras
    Representations of the Poincaré group
    Representation theory of the Lorentz group
    Double group
    Noether's Theorem
    Goldstone theorem
    Stone-von Neumann theorem
    Peter-Weyl theorem
    Supersymmetry
    Invariance mechanics
    Wigner's theorem
    Wigner's classification
    Quantum hydrodynamics
    Quantum magnetodynamics
    Nonabelian group

Higher Dimensional Algebra
    Algebraic Topology
    Category Theory
    Double Groupoid
    Higher dimensional algebra
    Higher category theory
    Duality (mathematics)
    Anabelian geometry
    Noncommutative geometry

Quantum Logics and Quantum Computers
    Multi-valued logic
    Quantum information
    Quantum logic
    Quantum computer

Algebraic Logic
    Boolean logic
    Algebraic logic
    Łukasiewicz logic
    Intuitionistic logic
    Mathematical logic
    Heyting arithmetic
    Symbolic logic
    Metatheory
    Metalogic

Quantum Biographies
    Niels Bohr
    Max Planck
    Louis de Broglie
    The Lord Rutherford of Nelson
    Albert Einstein
    Erwin Schrödinger
    John von Neumann
    John van Vleck
    Paul Dirac
    Werner Heisenberg
    Albert Einstein
    Max Born
    Eugene Wigner
    Alexandru Proca
    Hideki Yukawa
    Neville Mott
    Paul W. Anderson
    George Karreman
    Alberte Pullman
    Richard Feynman
    Murray Gell-Mann
    Steven Weinberg
    Anatole Abragam
    Ștefan Procopiu
    Ionel Solomon

References
    Article Sources and Contributors
    Image Sources, Licenses and Contributors



License 






Quantum Theories 



Quantum mechanics 



Quantum mechanics, also known as quantum physics or quantum theory, is a branch of physics providing a
mathematical description of much of the dual particle-like and wave-like behavior and interactions of energy and
matter. It departs from classical mechanics primarily at the atomic and subatomic scales, the so-called quantum
realm. In advanced topics of quantum mechanics, some of these behaviors are macroscopic and only emerge at very
low or very high energies or temperatures. The name, coined by Max Planck, derives from the observation that some
physical quantities can be changed only by discrete amounts, or quanta, as multiples of the Planck constant, rather
than being capable of varying continuously or by any arbitrary amount. For example, the angular momentum, or more
generally the action, of an electron bound into an atom or molecule is quantized. While an unbound electron does
not exhibit quantized energy levels, an electron bound in an atomic orbital has quantized values of angular
momentum. In the context of quantum mechanics, the wave-particle duality of energy and matter and the uncertainty
principle provide a unified view of the behavior of photons, electrons and other atomic-scale objects.



Fig. 1: Probability densities corresponding to the wavefunctions of an electron in a hydrogen atom possessing
definite energy levels (increasing from the top of the image to the bottom: n = 1, 2, 3, ...) and angular momentum
(increasing across from left to right: s, p, d, ...). Brighter areas correspond to higher probability density in a
position measurement. Wavefunctions like these are directly comparable to Chladni's figures of acoustic modes of
vibration in classical physics and are indeed modes of oscillation as well: they possess a sharp energy and thus a
sharp frequency. The angular momentum and energy are quantized, and only take on discrete values like those shown
(as is the case for resonant frequencies in acoustics).



The mathematical formulations of quantum mechanics are abstract, and their implications are often non-intuitive in
terms of classical physics. The centerpiece of the mathematical system is the wavefunction, a mathematical function
providing information about the probability amplitude of the position and momentum of a particle. Mathematical
manipulations of the wavefunction usually involve the bra-ket notation, which requires an understanding of complex
numbers and linear functionals. In simple model systems, such as the quantum harmonic oscillator, the mathematics
is akin to that of acoustic resonance. Many of the results of quantum mechanics do not have models that are easily
visualized in terms of classical mechanics; for instance, the ground state in the quantum mechanical model is a
non-zero energy state that is the lowest permitted energy state of a system, whereas a classical system at its
lowest energy is thought of as simply being at rest with zero kinetic energy.

Historically, the earliest versions of quantum mechanics were formulated in the first decade of the 20th century,
at around the same time that the atomic theory and the corpuscular theory of light (as updated by Einstein) first
came to be widely accepted as scientific fact; these latter theories can be viewed as quantum theories of matter and
electromagnetic radiation, respectively. Quantum theory was significantly reformulated in the mid-1920s, moving
away from the old quantum theory towards the quantum mechanics formulated by Werner Heisenberg, Max Born, Wolfgang
Pauli and their associates, accompanied by the acceptance of the Copenhagen interpretation of Niels Bohr. By 1930,
quantum mechanics had been further unified and formalized by the work of Paul Dirac and John von Neumann, with a
greater emphasis placed on measurement in quantum mechanics, the statistical nature of our knowledge of reality and
philosophical speculation about the role of the observer. Quantum mechanics has since branched out into almost
every aspect of 20th century physics and other disciplines such as quantum chemistry, quantum electronics, quantum
optics and quantum information science. Much 19th century physics has been re-evaluated as the classical limit of
quantum mechanics and of its more advanced developments in terms of quantum field theory, string theory, and
speculative quantum gravity theories.

History 

The history of quantum mechanics dates back to the 1838 discovery of cathode rays by Michael Faraday. This was
followed by the 1859 statement of the black body radiation problem by Gustav Kirchhoff, the 1877 suggestion by
Ludwig Boltzmann that the energy states of a physical system can be discrete, and the 1900 quantum hypothesis of
Max Planck.[1] Planck's hypothesis that energy is radiated and absorbed in discrete "quanta", or "energy elements",
enabled the correct derivation of the observed patterns of black body radiation. According to Planck, each energy
element E is proportional to its frequency $\nu$:

$E = h\nu$

where h is Planck's action constant. Planck cautiously insisted that this was simply an aspect of the processes of
absorption and emission of radiation and had nothing to do with the physical reality of the radiation itself.
However, in 1905 Albert Einstein interpreted Planck's quantum hypothesis realistically and used it to explain the
photoelectric effect, in which shining light on certain materials can eject electrons from the material. Einstein
postulated that light itself consists of individual quanta of energy, later called photons.
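
As a rough numerical illustration of Planck's relation between energy and frequency, the sketch below computes the energy of a single light quantum from its wavelength, using frequency = c / wavelength. The two wavelengths (700 nm and 250 nm) and the function names are illustrative assumptions, not taken from the text; the constants are the exact SI defining values.

```python
# Planck relation: E = h * nu, with nu = c / wavelength for light in vacuum.
PLANCK_H = 6.62607015e-34          # Planck constant, J*s (exact SI value)
LIGHT_C = 299_792_458.0            # speed of light, m/s (exact SI value)
ELECTRON_VOLT = 1.602176634e-19    # joules per electronvolt (exact SI value)


def photon_energy_joules(frequency_hz: float) -> float:
    """Energy of one quantum of radiation at the given frequency."""
    return PLANCK_H * frequency_hz


def photon_energy_ev_from_wavelength(wavelength_m: float) -> float:
    """Energy of one quantum, in electronvolts, for a vacuum wavelength in metres."""
    frequency_hz = LIGHT_C / wavelength_m
    return photon_energy_joules(frequency_hz) / ELECTRON_VOLT


if __name__ == "__main__":
    # Hypothetical example wavelengths: red light and ultraviolet light.
    for label, wavelength in (("red, 700 nm", 700e-9), ("ultraviolet, 250 nm", 250e-9)):
        energy = photon_energy_ev_from_wavelength(wavelength)
        print(f"{label}: {energy:.2f} eV per photon")
```

Because the energy of each quantum depends on frequency rather than on intensity, a shorter wavelength delivers more energy per photon, which is the feature Einstein's account of the photoelectric effect relies on.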

The foundations of quantum mechanics were established during the first half of the twentieth century by Niels Bohr,
Werner Heisenberg, Max Planck, Louis de Broglie, Albert Einstein, Erwin Schrödinger, Max Born, John von
Neumann, Paul Dirac, Wolfgang Pauli, David Hilbert, and others. In the mid-1920s, developments in quantum
mechanics quickly led to its becoming the standard formulation for atomic physics. In the summer of 1925, Bohr and
Heisenberg published results that closed the "Old Quantum Theory". Out of deference to their dual state as particles,
light quanta came to be called photons (1926). From Einstein's simple postulation was born a flurry of debating,
theorizing and testing. Thus the entire field of quantum physics emerged, leading to its wider acceptance at the
Fifth Solvay Conference in 1927.

The other exemplar that led to quantum mechanics was the study of electromagnetic waves such as light. When it 
was found in 1900 by Max Planck that the energy of waves could be described as consisting of small packets or 
quanta, Albert Einstein further developed this idea to show that an electromagnetic wave such as light could be 
described by a particle called the photon with a discrete energy dependent on its frequency. This led to a theory of 
unity between subatomic particles and electromagnetic waves called wave-particle duality in which particles and 
waves were neither one nor the other, but had certain properties of both. While quantum mechanics describes the 
world of the very small, it also is needed to explain certain macroscopic quantum systems such as superconductors 
and superfluids. 

The word quantum derives from the Latin, meaning "how great" or "how much".[4] In quantum mechanics, it refers to a
discrete unit that quantum theory assigns to certain physical quantities, such as the energy of an atom at rest (see
Figure 1). The discovery that particles are discrete packets of energy with wave-like properties led to the branch of
physics that deals with atomic and subatomic systems, which is today called quantum mechanics. It is the underlying
mathematical framework of many fields of physics and chemistry, including condensed matter physics, solid-state
physics, atomic physics, molecular physics, computational physics, computational chemistry, quantum chemistry,
particle physics, nuclear chemistry, and nuclear physics. Some fundamental aspects of the theory are still actively
studied.

Quantum mechanics is essential to understand the behavior of systems at atomic length scales and smaller.
For example, if classical mechanics governed the workings of an atom, electrons would rapidly travel towards and
collide with the nucleus, making stable atoms impossible. However, in the natural world the electrons normally
remain in an uncertain, non-deterministic, "smeared" (wave-like) orbital around the nucleus, defying classical
electromagnetism. Quantum mechanics was initially developed to provide a better explanation of the atom, especially
the spectra of light emitted by different atomic species. The quantum theory of the atom was developed as an
explanation for the electron's staying in its orbital, which could not be explained by Newton's laws of motion and
by Maxwell's laws of classical electromagnetism. Broadly speaking, quantum mechanics incorporates four classes of
phenomena for which classical physics cannot account:

• the quantization (discretization) of certain physical quantities

• wave-particle duality

• the uncertainty principle

• quantum entanglement



Mathematical formulations 

In the mathematically rigorous formulation of quantum mechanics developed by Paul Dirac and John von Neumann, the
possible states of a quantum mechanical system are represented by unit vectors (called "state vectors"). Formally,
these reside in a complex separable Hilbert space (variously called the "state space" or the "associated Hilbert
space" of the system) and are well defined up to a complex number of norm 1 (the phase factor). In other words, the
possible states are points in the projectivization of a Hilbert space, usually called the complex projective space.
The exact nature of this Hilbert space depends on the system; for example, the state space for position and
momentum states is the space of square-integrable functions, while the state space for the spin of a single proton is
just the product of two complex planes. Each observable is represented by a maximally Hermitian (precisely: a
self-adjoint) linear operator acting on the state space. Each eigenstate of an observable corresponds to an
eigenvector of the operator, and the associated eigenvalue corresponds to the value of the observable in that
eigenstate. If the operator's spectrum is discrete, the observable can only attain those discrete eigenvalues.
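
To make the link between observables and Hermitian operators concrete, here is a minimal sketch for the smallest non-trivial state space, the two-dimensional space of a spin-1/2 particle. It uses the closed-form eigenvalues of a 2x2 Hermitian matrix; the choice of the Pauli matrix sigma_y as the example observable, and all function names, are assumptions made for illustration.

```python
import math


def is_hermitian(m, tol=1e-12):
    """True if the 2x2 matrix equals its own conjugate transpose."""
    return all(abs(m[i][j] - m[j][i].conjugate()) < tol
               for i in range(2) for j in range(2))


def hermitian_eigenvalues_2x2(m):
    """Eigenvalues of a 2x2 Hermitian matrix [[a, b], [conj(b), d]].

    They are (a + d)/2 -/+ sqrt(((a - d)/2)**2 + |b|**2), which is always real.
    """
    if not is_hermitian(m):
        raise ValueError("matrix is not Hermitian")
    a, d, b = m[0][0].real, m[1][1].real, m[0][1]
    mean = (a + d) / 2
    radius = math.sqrt(((a - d) / 2) ** 2 + abs(b) ** 2)
    return (mean - radius, mean + radius)


# Pauli matrix sigma_y: complex entries, yet a legitimate observable.
SIGMA_Y = [[0j, -1j],
           [1j, 0j]]

if __name__ == "__main__":
    print(hermitian_eigenvalues_2x2(SIGMA_Y))
```

Although the matrix has purely imaginary off-diagonal entries, its eigenvalues are real and discrete, so a measurement of this observable can only yield one of two values.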

In the formalism of quantum mechanics, the state of a system at a given time is described by a complex wave
function, also referred to as a state vector in a complex vector space.[10] This abstract mathematical object allows
for the calculation of probabilities of outcomes of concrete experiments. For example, it allows one to compute the
probability of finding an electron in a particular region around the nucleus at a particular time. Contrary to
classical mechanics, one can never make simultaneous predictions of conjugate variables, such as position and
momentum, with arbitrary accuracy. For instance, electrons may be considered to be located somewhere within a region
of space, but with their exact positions being unknown. Contours of constant probability, often referred to as
"clouds", may be drawn around the nucleus of an atom to conceptualize where the electron is most likely to be
located. Heisenberg's uncertainty principle quantifies the inability to precisely locate the particle given its
conjugate momentum.[11]

As the result of a measurement, the wave function containing the probability information for a system collapses from 
a given initial state to a particular eigenstate of the observable. The possible results of a measurement are the 
eigenvalues of the operator representing the observable — which explains the choice of Hermitian operators, for 
which all the eigenvalues are real. We can find the probability distribution of an observable in a given state by 
computing the spectral decomposition of the corresponding operator. Heisenberg's uncertainty principle is 
represented by the statement that the operators corresponding to certain observables do not commute. 
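
The two statements in this paragraph, that outcome probabilities follow from decomposing the state over the operator's eigenvectors and that incompatible observables do not commute, can be sketched for a spin-1/2 system. The example state (spin "up" along z), the measured observable (spin along x) and all names below are illustrative assumptions.

```python
import math


def inner(u, v):
    """Inner product <u|v>, conjugate-linear in the first argument."""
    return sum(a.conjugate() * b for a, b in zip(u, v))


def born_probabilities(state, eigenvectors):
    """Probability of each outcome: |<e_k|state>|**2 / <state|state>."""
    norm = inner(state, state).real
    return [abs(inner(e, state)) ** 2 / norm for e in eigenvectors]


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def commutator(a, b):
    """[A, B] = AB - BA; a non-zero result means the observables are incompatible."""
    ab, ba = matmul(a, b), matmul(b, a)
    n = len(a)
    return [[ab[i][j] - ba[i][j] for j in range(n)] for i in range(n)]


SIGMA_X = [[0, 1], [1, 0]]
SIGMA_Z = [[1, 0], [0, -1]]
S = 1 / math.sqrt(2)
SIGMA_X_EIGENVECTORS = [(S, S), (S, -S)]   # eigenvalues +1 and -1 of sigma_x
SPIN_UP_Z = (1, 0)                         # eigenstate of sigma_z with eigenvalue +1

if __name__ == "__main__":
    print(born_probabilities(SPIN_UP_Z, SIGMA_X_EIGENVECTORS))
    print(commutator(SIGMA_X, SIGMA_Z))
```

A state with a definite value of spin along z gives equal odds for the two outcomes of a spin measurement along x, and the commutator of the two operators is non-zero, in line with the uncertainty principle as stated above.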

The probabilistic nature of quantum mechanics thus stems from the act of measurement. This is one of the most
difficult aspects of quantum systems to understand. It was the central topic in the famous Bohr-Einstein debates, in
which the two scientists attempted to clarify these fundamental principles by way of thought experiments. In the
decades after the formulation of quantum mechanics, the question of what constitutes a "measurement" has been
extensively studied. Interpretations of quantum mechanics have been formulated to do away with the concept of
"wavefunction collapse"; see, for example, the relative state interpretation. The basic idea is that when a quantum
system interacts with a measuring apparatus, their respective wavefunctions become entangled, so that the original
quantum system ceases to exist as an independent entity. For details, see the article on measurement in quantum
mechanics.

Generally, quantum mechanics does not assign definite values to observables. Instead, it makes predictions using
probability distributions; that is, the probability of obtaining each possible outcome from measuring an
observable.[13] Often these results are skewed by many causes, such as dense probability clouds or quantum state
nuclear attraction. Naturally, these probabilities will depend on the quantum state at the "instant" of the
measurement. Hence, uncertainty is involved in the value. There are, however, certain states that are associated with
a definite value of a particular observable. These are known as eigenstates of the observable ("eigen" can be
translated from German as "inherent" or "characteristic").

In the everyday world, it is natural and intuitive to think of everything (every observable) as being in an
eigenstate. Everything appears to have a definite position, a definite momentum, a definite energy, and a definite
time of occurrence. However, quantum mechanics does not pinpoint exact values of a particle's position and momentum
(since they are a conjugate pair) or of its energy and time (since they too are a conjugate pair); rather, it
provides only a range of probabilities for where that particle might be, given the probability distribution of its
momentum. Therefore, it is helpful to use different words to describe states having uncertain values and states
having definite values (eigenstates). Usually, a system will not be in an eigenstate of the observable we are
interested in. However, if one measures the observable, the wavefunction will instantaneously become an eigenstate
(or generalized eigenstate) of that observable. This process is known as wavefunction collapse, a debatable
process.[17] It involves expanding the system under study to include the measurement device. If one knows the
corresponding wave function at the instant before the measurement, one will be able to compute the probability of
collapsing into each of the possible eigenstates. For example, a free particle will usually have a wavefunction that
is a wave packet centered around some mean position $x_0$, neither an eigenstate of position nor of momentum. When
one measures the position of the particle, it is impossible to predict the result with certainty.[12] It is
probable, but not certain, that it will be near $x_0$, where the amplitude of the wave function is large. After the
measurement is performed, having obtained some result x, the wave function collapses into a position eigenstate
centered at x.[18]

The time evolution of a quantum state is described by the Schrödinger equation, in which the Hamiltonian, the
operator corresponding to the total energy of the system, generates time evolution. The time evolution of wave
functions is deterministic in the sense that, given a wavefunction at an initial time, it makes a definite prediction
of what the wavefunction will be at any later time.[19]

During a measurement, on the other hand, the change of the wavefunction into another one is not deterministic, but
rather unpredictable, i.e., random. A time-evolution simulation can be seen in [20]. Wave functions change as time
progresses; the Schrödinger equation describes how, playing a role similar to that of Newton's second law in
classical mechanics. The Schrödinger equation, applied to the aforementioned example of the free particle, predicts
that the center of a wave packet will move through space at a constant velocity, like a classical particle with no
forces acting on it. However, the wave packet will also spread out as time progresses, which means that the position
becomes more uncertain. This also has the effect of turning position eigenstates (which can be thought of as
infinitely sharp wave packets) into broadened wave packets that are no longer position eigenstates.[22]
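
The free-particle behavior described above can be put into numbers under one added assumption: that the packet is initially a minimum-uncertainty Gaussian. For that case the textbook result is that the center moves as x0 + (p0/m) t while the position spread grows as sigma(t) = sigma0 * sqrt(1 + (hbar t / (2 m sigma0^2))^2). The electron mass and the initial width of 1e-10 m used in the demo are illustrative choices, not values from the text.

```python
import math

HBAR = 1.054571817e-34            # reduced Planck constant, J*s
ELECTRON_MASS = 9.1093837015e-31  # kg


def packet_center(x0: float, p0: float, mass: float, t: float) -> float:
    """Mean position of a free wave packet: uniform motion at velocity p0 / mass."""
    return x0 + (p0 / mass) * t


def packet_width(sigma0: float, mass: float, t: float) -> float:
    """Position standard deviation of a free Gaussian packet at time t.

    Assumes a minimum-uncertainty Gaussian with initial spread sigma0 at t = 0.
    """
    return sigma0 * math.sqrt(1.0 + (HBAR * t / (2.0 * mass * sigma0 ** 2)) ** 2)


if __name__ == "__main__":
    sigma0 = 1e-10  # hypothetical initial spread, roughly atomic size
    for t in (0.0, 1e-16, 1e-15, 1e-14):
        width = packet_width(sigma0, ELECTRON_MASS, t)
        print(f"t = {t:.0e} s  ->  width = {width:.3e} m")
```

The narrower the initial packet, the faster it spreads, which is consistent with the remark that an infinitely sharp position eigenstate immediately broadens.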

Some wave functions produce probability distributions that are constant, or independent of time; for example, in a
stationary state of constant energy, time drops out of the absolute square of the wave function. Many systems that
are treated dynamically in classical mechanics are described by such "static" wave functions. For example, a single
electron in an unexcited atom is pictured classically as a particle moving in a circular trajectory around the atomic
nucleus, whereas in quantum mechanics it is described by a static, spherically symmetric wavefunction surrounding
the nucleus (Fig. 1). (Note that only the lowest angular momentum states, labeled s, are spherically symmetric.)
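
The statement that time drops out of the absolute square for a state of constant energy can be checked in one line. The notation below ($\varphi$ for the spatial part, $E$ for the energy, $\hbar$ for the reduced Planck constant) is introduced here for illustration and assumes a single energy eigenstate of a time-independent Hamiltonian:

```latex
\psi(x,t) = \varphi(x)\, e^{-iEt/\hbar}
\quad\Longrightarrow\quad
|\psi(x,t)|^{2}
  = \varphi^{*}(x)\, e^{+iEt/\hbar}\, \varphi(x)\, e^{-iEt/\hbar}
  = |\varphi(x)|^{2}.
```

The time dependence sits entirely in a phase factor of unit modulus, so the probability density is static even though the wave function itself oscillates.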

The Schrödinger equation acts on the entire probability amplitude, not merely its absolute value. Whereas the
absolute value of the probability amplitude encodes information about probabilities, its phase encodes information
about the interference between quantum states. This gives rise to the wave-like behavior of quantum states. It turns
out that analytic solutions of Schrödinger's equation are only available for a small number of model Hamiltonians, of
which the quantum harmonic oscillator, the particle in a box, the hydrogen molecular ion and the hydrogen atom are
the most important representatives. Even the helium atom, which contains just one more electron than hydrogen,
defies all attempts at a fully analytic treatment. There exist several techniques for generating approximate
solutions. For instance, in the method known as perturbation theory one uses the analytic results for a simple
quantum mechanical model to generate results for a more complicated model related to the simple model by, for
example, the addition of a weak potential energy. Another method is the "semi-classical equation of motion" approach,
which applies to systems for which quantum mechanics produces weak deviations from classical behavior. The deviations
can be calculated based on the classical motion. This approach is important for the field of quantum chaos.
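
As an example of one of the exactly solvable models named above, the particle in a box has the well-known energy levels E_n = n^2 h^2 / (8 m L^2) for a particle of mass m confined to a one-dimensional box of length L with infinitely high walls. The sketch below evaluates them; the electron mass and the box length of 1 nm are illustrative assumptions.

```python
PLANCK_H = 6.62607015e-34         # Planck constant, J*s
ELECTRON_MASS = 9.1093837015e-31  # kg
ELECTRON_VOLT = 1.602176634e-19   # joules per electronvolt


def box_energy_joules(n: int, mass: float, length: float) -> float:
    """Energy of level n (n = 1, 2, 3, ...) for an infinite 1D square well."""
    if n < 1:
        raise ValueError("quantum number n must be a positive integer")
    return (n ** 2 * PLANCK_H ** 2) / (8.0 * mass * length ** 2)


if __name__ == "__main__":
    length = 1e-9  # hypothetical box length: 1 nanometre
    for n in range(1, 5):
        energy_ev = box_energy_joules(n, ELECTRON_MASS, length) / ELECTRON_VOLT
        print(f"n = {n}: {energy_ev:.3f} eV")
```

The lowest level is non-zero and the spacing grows as n squared, so the allowed energies form a discrete ladder rather than a continuum.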

There are numerous mathematically equivalent formulations of quantum mechanics. One of the oldest and most
commonly used formulations is the transformation theory proposed by Cambridge theoretical physicist Paul Dirac,
which unifies and generalizes the two earliest formulations of quantum mechanics, matrix mechanics (invented by
Werner Heisenberg) and wave mechanics (invented by Erwin Schrödinger).[26] In this formulation, the instantaneous
state of a quantum system encodes the probabilities of its measurable properties, or "observables". Examples of
observables include energy, position, momentum, and angular momentum. Observables can be either continuous (e.g.,
the position of a particle) or discrete (e.g., the energy of an electron bound to a hydrogen atom).[27]

An alternative formulation of quantum mechanics is Feynman's path integral formulation, in which a
quantum-mechanical amplitude is considered as a sum over histories between initial and final states; this is the
quantum-mechanical counterpart of action principles in classical mechanics.



Interactions with other scientific theories 

The fundamental rules of quantum mechanics are very deep. They assert that the state space of a system is a Hilbert
space and that the observables are Hermitian operators acting on that space, but they do not tell us which Hilbert
space or which operators to use, or even whether a suitable choice exists. These must be chosen appropriately in
order to obtain a quantitative description of a quantum system. An important guide for making these choices is the
correspondence principle, which states that the predictions of quantum mechanics reduce to those of classical physics
when a system moves to higher energies or, equivalently, larger quantum numbers. In other words, classical mechanics
is simply the quantum mechanics of large systems. This "high energy" limit is known as the classical or
correspondence limit. One can therefore start from an established classical model of a particular system, and attempt
to guess the underlying quantum model that gives rise to the classical model in the correspondence limit.
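
A classic illustration of the correspondence limit, sketched here under the assumption of the Bohr model of hydrogen (which the text does not itself invoke), compares two frequencies: the frequency of light emitted in the transition from level n+1 to level n, and the classical orbital frequency of the electron in orbit n. In units of the Rydberg frequency these are 1/n^2 - 1/(n+1)^2 and 2/n^3 respectively, so their ratio needs no physical constants.

```python
def transition_to_orbital_frequency_ratio(n: int) -> float:
    """Ratio of the (n+1 -> n) emission frequency to the orbital frequency of orbit n.

    Both frequencies are expressed in units of the Rydberg frequency
    (Bohr-model hydrogen, an assumption made for this illustration).
    """
    if n < 1:
        raise ValueError("quantum number n must be a positive integer")
    quantum = 1.0 / n ** 2 - 1.0 / (n + 1) ** 2   # emitted-light frequency
    classical = 2.0 / n ** 3                       # orbital revolution frequency
    return quantum / classical


if __name__ == "__main__":
    for n in (1, 2, 10, 100, 1000):
        print(f"n = {n:5d}: ratio = {transition_to_orbital_frequency_ratio(n):.4f}")
```

For small n the two frequencies differ substantially, while for large quantum numbers the ratio approaches 1, meaning the quantum prediction merges with the classical expectation that an orbiting charge radiates at its orbital frequency.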

When quantum mechanics was originally formulated, it was applied to models whose correspondence limit was 
non-relativistic classical mechanics. For instance, the well-known model of the quantum harmonic oscillator uses an 
explicitly non-relativistic expression for the kinetic energy of the oscillator, and is thus a quantum version of the 
classical harmonic oscillator. 

Early attempts to merge quantum mechanics with special relativity involved the replacement of the Schrödinger
equation with a covariant equation such as the Klein-Gordon equation or the Dirac equation. While these theories
were successful in explaining many experimental results, they had certain unsatisfactory qualities stemming from
their neglect of the relativistic creation and annihilation of particles. A fully relativistic quantum theory required
the development of quantum field theory, which applies quantization to a field rather than a fixed set of particles.
The first complete quantum field theory, quantum electrodynamics, provides a fully quantum description of the
electromagnetic interaction. The full apparatus of quantum field theory is often unnecessary for describing
electrodynamic systems. A simpler approach, one employed since the inception of quantum mechanics, is to treat
charged particles as quantum mechanical objects being acted on by a classical electromagnetic field. For example,
the elementary quantum model of the hydrogen atom describes the electric field of the hydrogen atom using a
classical $-e^{2}/(4\pi\epsilon_{0} r)$ Coulomb potential. This "semi-classical" approach fails if quantum
fluctuations in the electromagnetic field play an important role, such as in the emission of photons by charged
particles.

Quantum field theories for the strong nuclear force and the weak nuclear force have been developed. The quantum
field theory of the strong nuclear force is called quantum chromodynamics, and describes the interactions of the
subnuclear particles: quarks and gluons. The weak nuclear force and the electromagnetic force were unified, in their
quantized forms, into a single quantum field theory known as electroweak theory, by the physicists Abdus Salam,
Sheldon Glashow and Steven Weinberg. These three men shared the Nobel Prize in Physics in 1979 for this work.[28]

It has proven difficult to construct quantum models of gravity, the remaining fundamental force. Semi-classical
approximations are workable, and have led to predictions such as Hawking radiation. However, the formulation of a
complete theory of quantum gravity is hindered by apparent incompatibilities between general relativity, the most
accurate theory of gravity currently known, and some of the fundamental assumptions of quantum theory. The
resolution of these incompatibilities is an area of active research, and theories such as string theory are among the
possible candidates for a future theory of quantum gravity. Classical mechanics has been extended into the complex
domain, and complex classical mechanics exhibits behaviours similar to quantum mechanics.

Quantum mechanics and classical physics 

Predictions of quantum mechanics have been verified experimentally to a very high degree of accuracy. According
to the correspondence principle between classical and quantum mechanics, all objects obey the laws of quantum
mechanics, and classical mechanics is just an approximation for large systems (or a statistical quantum mechanics of
a large collection of particles). The laws of classical mechanics thus follow from the laws of quantum mechanics as a
statistical average at the limit of large systems or large quantum numbers.[30] However, chaotic systems do not have
good quantum numbers, and quantum chaos studies the relationship between classical and quantum descriptions in
these systems.

Quantum coherence is an essential difference between classical and quantum theories, and is illustrated by the
Einstein-Podolsky-Rosen paradox. Quantum interference involves the addition of probability amplitudes, whereas
classical probabilities (like the intensities of incoherent classical waves) simply add. For microscopic bodies, the
extension of the system is much smaller than the coherence length, which gives rise to long-range entanglement and
other nonlocal phenomena characteristic of quantum systems. Quantum coherence is not typically evident at
macroscopic scales, although an exception to this rule can occur at extremely low temperatures, when quantum behavior
can manifest itself on more macroscopic scales (see Bose-Einstein condensate). This is in accordance with the
following observations:

• Many macroscopic properties of a classical system are direct consequences of the quantum behavior of its parts.
For example, the stability of bulk matter (which consists of atoms and molecules that would quickly collapse
under electric forces alone), the rigidity of solids, and the mechanical, thermal, chemical, optical and magnetic
properties of matter are all results of the interaction of electric charges under the rules of quantum mechanics.

• While the seemingly exotic behavior of matter posited by quantum mechanics and relativity theory becomes more
apparent when dealing with extremely fast-moving or extremely tiny particles, the laws of classical Newtonian
physics remain accurate in predicting the behavior of large objects (of the order of the size of large molecules
and bigger) at velocities much smaller than the velocity of light.
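
The difference between adding probability amplitudes and adding intensities, noted earlier in this section, can be seen in a two-path toy calculation. The sketch assumes two paths of equal weight whose amplitudes differ only by a relative phase; the function names and the chosen phases are illustrative.

```python
import cmath
import math


def two_path_amplitudes(relative_phase: float):
    """Two equal-magnitude amplitudes (each of squared modulus 1/2) with a relative phase."""
    a = math.sqrt(0.5)
    return a, a * cmath.exp(1j * relative_phase)


def coherent_intensity(a1: complex, a2: complex) -> float:
    """Amplitudes are added first, then squared: interference terms survive."""
    return abs(a1 + a2) ** 2


def incoherent_intensity(a1: complex, a2: complex) -> float:
    """Squared moduli are added: no dependence on the relative phase."""
    return abs(a1) ** 2 + abs(a2) ** 2


if __name__ == "__main__":
    for phase in (0.0, math.pi / 2, math.pi):
        a1, a2 = two_path_amplitudes(phase)
        print(f"phase = {phase:.3f}: coherent = {coherent_intensity(a1, a2):.3f}, "
              f"incoherent = {incoherent_intensity(a1, a2):.3f}")
```

The coherent sum swings between full reinforcement and full cancellation as the phase varies, whereas the incoherent sum stays fixed; loss of coherence at macroscopic scales corresponds to the second case.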






Relativity and quantum mechanics 

Main articles: Quantum gravity and Theory of everything 

Even though the defining postulates of both Einstein's theory of general relativity and quantum theory are
supported by rigorous and repeated empirical evidence, and although they do not directly contradict each other
theoretically (at least with regard to their primary claims), the two theories have resisted incorporation into one
cohesive model.[34]

Einstein himself is well known for rejecting some of the claims of quantum mechanics. While clearly contributing to 
the field, he did not accept the more philosophical consequences and interpretations of quantum mechanics, such as 
the lack of deterministic causality and the assertion that a single subatomic particle can occupy numerous areas of 
space at one time. He also was the first to notice some of the apparently exotic consequences of entanglement and 
used them to formulate the Einstein-Podolsky-Rosen paradox, in the hope of showing that quantum mechanics had 
unacceptable implications. This was in 1935, but in 1964 it was shown by John Bell (see Bell inequality) that, although 
Einstein was correct in identifying seemingly paradoxical implications of quantum mechanical nonlocality, these 
implications could be experimentally tested. Alain Aspect's initial experiments in 1982, and many subsequent 
experiments since, have verified quantum entanglement. 

According to the paper of J. Bell and the Copenhagen interpretation (the common interpretation of quantum 
mechanics by physicists since 1927), and contrary to Einstein's ideas, quantum mechanics was not at the same time 

• a "realistic" theory 

• and a local theory. 

The Einstein-Podolsky-Rosen paradox shows in any case that there exist experiments by which one can measure the 
state of one particle and instantaneously change the state of its entangled partner, although the two particles can be 
an arbitrary distance apart; however, this effect does not violate causality, since no transfer of information happens. 
Quantum entanglement is at the basis of quantum cryptography, with high-security commercial applications in 
banking and government. 
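
The kind of experimental test that Bell's result makes possible can be sketched numerically. The example below uses the CHSH form of the inequality (a variant not named in the text above, chosen here because it is easy to compute): any local realistic theory gives |S| ≤ 2, while the quantum prediction for an entangled spin singlet reaches 2√2. The measurement angles, function names and the restriction to axes in the x-z plane are illustrative assumptions, not part of the source.

```python
import numpy as np

# Pauli matrices used to build spin observables in the x-z plane.
SIGMA_X = np.array([[0, 1], [1, 0]], dtype=complex)
SIGMA_Z = np.array([[1, 0], [0, -1]], dtype=complex)

# Spin singlet (|01> - |10>)/sqrt(2), a maximally entangled two-particle state.
SINGLET = np.array([0, 1, -1, 0], dtype=complex) / np.sqrt(2)


def spin_observable(theta):
    """Spin measurement along an axis at angle theta from z, in the x-z plane."""
    return np.cos(theta) * SIGMA_Z + np.sin(theta) * SIGMA_X


def correlation(theta_a, theta_b):
    """Quantum expectation value of the joint measurement in the singlet state."""
    joint = np.kron(spin_observable(theta_a), spin_observable(theta_b))
    return float(np.real(SINGLET.conj() @ joint @ SINGLET))


def chsh(a, a_prime, b, b_prime):
    """CHSH combination S; local realistic theories satisfy |S| <= 2."""
    return (correlation(a, b) - correlation(a, b_prime)
            + correlation(a_prime, b) + correlation(a_prime, b_prime))


# Illustrative (assumed) angle choice that maximises the quantum violation.
S = chsh(0.0, np.pi / 2, np.pi / 4, 3 * np.pi / 4)
print(f"|S| = {abs(S):.4f}, local bound = 2, quantum maximum = {2 * np.sqrt(2):.4f}")
```

The correlations themselves carry no usable signal for either observer alone, which is consistent with the remark above that no transfer of information takes place.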

Gravity is negligible in many areas of particle physics, so that unification between general relativity and quantum 
mechanics is not an urgent issue in those applications. However, the lack of a correct theory of quantum gravity is an 
important issue in cosmology and physicists' search for an elegant "theory of everything". Thus, resolving the 
inconsistencies between both theories has been a major goal of twentieth- and twenty-first-century physics. Many 
prominent physicists, including Stephen Hawking, have labored in the attempt to discover a theory underlying 
everything, combining not only different models of subatomic physics, but also deriving the universe's four 
forces (the strong force, electromagnetism, the weak force, and gravity) from a single force or phenomenon. One of 
the leaders in this field is Edward Witten, a theoretical physicist who formulated the groundbreaking M-theory, 
which is an attempt at describing supersymmetry-based string theory. 

Attempts at a unified field theory 

As of 2010 the quest for unifying the fundamental forces through quantum mechanics is still ongoing. Quantum 
electrodynamics (or "quantum electromagnetism"), which is currently (in the perturbative regime at least) the most 
accurately tested physical theory, has been successfully merged with the weak nuclear force into the electroweak 
force, and work is currently being done to merge the electroweak and strong forces into the electrostrong force. 
Current predictions state that at around $10^{14}$ GeV the three aforementioned forces are fused into a single unified 
field.[36] Beyond this "grand unification," it is speculated that it may be possible to merge gravity with the other 
three gauge symmetries, expected to occur at roughly $10^{19}$ GeV. However, while special relativity is 
parsimoniously incorporated into quantum electrodynamics, general relativity, currently the best 
theory describing the gravitational force, has not been fully incorporated into quantum theory. 






Philosophical implications 

Since its inception, the many counter-intuitive results of quantum mechanics have provoked strong philosophical 
debate and many interpretations. Even fundamental issues such as Max Born's basic rules concerning probability 
amplitudes and probability distributions took decades to be appreciated. 

Richard Feynman said, "I think I can safely say that nobody understands quantum mechanics."[37] 

The Copenhagen interpretation, due largely to the Danish theoretical physicist Niels Bohr, is the interpretation of the 
quantum mechanical formalism most widely accepted amongst physicists. According to it, the probabilistic nature of 
quantum mechanics is not a temporary feature which will eventually be replaced by a deterministic theory, but 
instead must be considered to be a final renunciation of the classical ideal of causality. In this interpretation, it is 
believed that any well-defined application of the quantum mechanical formalism must always make reference to the 
experimental arrangement, due to the complementary nature of evidence obtained under different experimental 
situations. 

Albert Einstein, himself one of the founders of quantum theory, disliked this loss of determinism in measurement. 
(This dislike is the source of his famous quote, "God does not play dice with the universe.") Einstein held that there 
should be a local hidden variable theory underlying quantum mechanics and that, consequently, the present theory 
was incomplete. He produced a series of objections to the theory, the most famous of which has become known as 
the Einstein-Podolsky-Rosen paradox. John Bell showed that the EPR paradox led to experimentally testable 
differences between quantum mechanics and local realistic theories. Experiments have been performed confirming 
the accuracy of quantum mechanics, thus demonstrating that the physical world cannot be described by local realistic 
theories.[38] The Bohr-Einstein debates provide a vibrant critique of the Copenhagen Interpretation from an 
epistemological point of view. 

The Everett many-worlds interpretation, formulated in 1956, holds that all the possibilities described by quantum 
theory simultaneously occur in a multiverse composed of mostly independent parallel universes.[39] This is not 
accomplished by introducing some new axiom to quantum mechanics, but on the contrary by removing the axiom of 
the collapse of the wave packet: All the possible consistent states of the measured system and the measuring 
apparatus (including the observer) are present in a real physical (not just formally mathematical, as in other 
interpretations) quantum superposition. Such a superposition of consistent state combinations of different systems is 
called an entangled state. While the multiverse is deterministic, we perceive non-deterministic behavior governed by 
probabilities, because we can observe only the universe, i.e. the consistent state contribution to the mentioned 
superposition, we inhabit. Everett's interpretation is perfectly consistent with experimental tests of Bell's inequality and makes 
them intuitively understandable. However, according to the theory of quantum decoherence, the parallel universes 
will never be accessible to us. This inaccessibility can be understood as follows: Once a measurement is done, the 
measured system becomes entangled with both the physicist who measured it and a huge number of other particles, 
some of which are photons flying away towards the other end of the universe; in order to prove that the wave 
function did not collapse one would have to bring all these particles back and measure them again, together with the 
system that was measured originally. This is completely impractical, but even if one could theoretically do this, it 
would destroy any evidence that the original measurement took place (including the physicist's memory). 






Applications 

Quantum mechanics has had enormous success in explaining many of the features of our world. The individual 
behaviour of the subatomic particles that make up all forms of matter (electrons, protons, neutrons, photons and 
others) can often only be satisfactorily described using quantum mechanics. Quantum mechanics has strongly 
influenced string theory, a candidate for a theory of everything (see reductionism), and the multiverse hypothesis. 

Quantum mechanics is important for understanding how individual atoms combine covalently to form chemicals or 
molecules. The application of quantum mechanics to chemistry is known as quantum chemistry. (Relativistic) 
quantum mechanics can in principle mathematically describe most of chemistry. Quantum mechanics can provide 
quantitative insight into ionic and covalent bonding processes by explicitly showing which molecules are 
energetically favorable to which others, and by approximately how much.[40] Most of the calculations performed in 
computational chemistry rely on quantum mechanics.[41] 

Much of modern technology operates at a scale where quantum effects are significant. Examples include the laser, 
the transistor (and thus the microchip), the electron microscope, and magnetic resonance imaging. The study of 
semiconductors led to the invention of the diode and the transistor, which are indispensable for modern electronics. 

Researchers are currently seeking robust methods of directly manipulating quantum states. Efforts are being made to 
develop quantum cryptography, which will allow guaranteed secure transmission of information. A more distant goal is 
the development of quantum computers, which are expected to perform certain computational tasks exponentially 
faster than classical computers. Another active research topic is quantum teleportation, which deals with techniques 
to transmit quantum information over arbitrary distances. 

Quantum tunneling is vital in many devices, even in the simple light switch, as otherwise the electrons in the electric 
current could not penetrate the potential barrier made up of a layer of oxide. Flash memory chips found in USB 
drives use quantum tunneling to erase their memory cells. 
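
To give a feel for how strongly tunneling depends on barrier thickness, the following minimal sketch evaluates the standard textbook transmission probability for a particle of energy E striking a rectangular barrier of height V0 > E and width a, namely T = 1 / (1 + V0² sinh²(κa) / (4E(V0 − E))) with κ = √(2m(V0 − E))/ħ. The rectangular barrier model, the function name and the sample energies and widths are assumptions made for illustration; they are not taken from the devices mentioned above.

```python
import math

HBAR = 1.054571817e-34   # reduced Planck constant, J*s
M_E = 9.1093837015e-31   # electron mass, kg
EV = 1.602176634e-19     # one electronvolt in joules


def transmission(energy_ev, barrier_ev, width_m, mass=M_E):
    """Transmission probability through a rectangular barrier for 0 < E < V0."""
    if not 0.0 < energy_ev < barrier_ev:
        raise ValueError("this sketch only covers 0 < E < V0")
    energy = energy_ev * EV
    barrier = barrier_ev * EV
    # Decay constant of the wavefunction inside the classically forbidden region.
    kappa = math.sqrt(2.0 * mass * (barrier - energy)) / HBAR
    sinh_sq = math.sinh(kappa * width_m) ** 2
    return 1.0 / (1.0 + barrier ** 2 * sinh_sq / (4.0 * energy * (barrier - energy)))


# Hypothetical example: a 1 eV electron meeting a 2 eV barrier of varying width.
for width_nm in (0.5, 1.0, 2.0):
    t = transmission(1.0, 2.0, width_nm * 1e-9)
    print(f"width = {width_nm} nm  ->  T = {t:.3e}")
```

The roughly exponential fall-off with width is why tunneling matters for nanometre-thin oxide layers but is negligible for macroscopic barriers.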

Quantum mechanics primarily applies to the atomic regimes of matter and energy, but some systems exhibit 
quantum mechanical effects on a large scale; superfluidity (the frictionless flow of a liquid at temperatures near 
absolute zero) is one well-known example. Quantum theory also provides accurate descriptions for many previously 
unexplained phenomena such as black body radiation and the stability of electron orbitals. It has also given insight 
into the workings of many different biological systems, including smell receptors and protein structures.[42] Recent 
work on photosynthesis has provided evidence that quantum correlations play an essential role in this most 
fundamental process of the plant kingdom.[43] Even so, classical physics often can be a good approximation to 
results otherwise obtained by quantum physics, typically in circumstances with large numbers of particles or large 
quantum numbers. (However, some open questions remain in the field of quantum chaos.) 



A working mechanism of a resonant tunneling diode device, based on the phenomenon of 
quantum tunneling through the potential barriers. (Composite plot of bands, transmission, current density and 
I-V characteristic at bias = 0 V; generated with RTD-NEGF on nanoHUB.) 






Examples 



Particle in a box 



The particle in a 1-dimensional potential energy box is the simplest example where restraints lead to the 
quantization of energy levels. The box is defined as having zero potential energy inside a certain region 
and infinite potential energy everywhere outside that region. For the 1-dimensional case in the x direction, 
the time-independent Schrodinger equation can be written as:[44] 

$$-\frac{\hbar^2}{2m}\frac{d^2\psi}{dx^2} = E\psi.$$

1-dimensional potential energy box (or infinite potential well), with walls at x = 0 and x = L. 

Writing the differential operator 

$$\hat{p}_x = -i\hbar\frac{d}{dx}$$

the previous equation can be seen to be evocative of the classic analogue 

$$\frac{1}{2m}\hat{p}_x^2 = E$$

with E as the energy for the state $\psi$, in this case coinciding with the kinetic energy of the particle. 

The general solutions of the Schrodinger equation for the particle in a box are: 

$$\psi(x) = Ae^{ikx} + Be^{-ikx}, \qquad E = \frac{\hbar^2 k^2}{2m}$$

or, from Euler's formula, 

$$\psi(x) = C\sin kx + D\cos kx.$$

The presence of the walls of the box determines the values of C, D, and k. At each wall (x = 0 and x = L), $\psi = 0$. 
Thus when x = 0, 

$$\psi(0) = 0 = C\sin 0 + D\cos 0 = D$$

and so D = 0. When x = L, 

$$\psi(L) = 0 = C\sin kL.$$

C cannot be zero, since this would conflict with the Born interpretation (the wave function would vanish everywhere 
and could not be normalized). Therefore sin kL = 0, and so it must be that kL is an integer multiple of π. Therefore, 

$$k = \frac{n\pi}{L}, \qquad n = 1, 2, 3, \ldots$$

The quantization of energy levels follows from this constraint on k, since 

$$E = \frac{\hbar^2\pi^2 n^2}{2mL^2} = \frac{n^2 h^2}{8mL^2}.$$
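
The quantized levels derived above can be checked numerically. The sketch below evaluates the closed-form levels for an electron in a box and, independently, diagonalizes a finite-difference approximation of the Hamiltonian with the wave function forced to zero at the walls. The 1 nm box width, the grid size and the dimensionless units (ħ = m = L = 1) used in the numerical part are assumptions chosen for illustration.

```python
import numpy as np

H_PLANCK = 6.62607015e-34   # Planck constant, J*s
M_E = 9.1093837015e-31      # electron mass, kg
EV = 1.602176634e-19        # one electronvolt in joules


def box_energy(n, length_m, mass=M_E):
    """Analytic level E_n = n^2 h^2 / (8 m L^2), returned in joules."""
    return n ** 2 * H_PLANCK ** 2 / (8.0 * mass * length_m ** 2)


def box_levels_numeric(num_levels, grid_points=500):
    """Lowest eigenvalues of -(1/2) d^2/dx^2 on (0, 1) with psi = 0 at both walls.

    Dimensionless units (hbar = m = L = 1) are an assumption of this sketch;
    in these units the exact levels are n^2 pi^2 / 2.
    """
    dx = 1.0 / (grid_points + 1)
    # Central-difference second derivative on the interior grid points only,
    # which builds the boundary condition psi(0) = psi(L) = 0 into the matrix.
    main = np.full(grid_points, 1.0 / dx ** 2)
    off = np.full(grid_points - 1, -0.5 / dx ** 2)
    hamiltonian = np.diag(main) + np.diag(off, 1) + np.diag(off, -1)
    return np.linalg.eigvalsh(hamiltonian)[:num_levels]


# Hypothetical example: an electron confined to a 1 nm wide box.
for n in (1, 2, 3):
    print(f"E_{n} = {box_energy(n, 1e-9) / EV:.3f} eV")

numeric = box_levels_numeric(3)
exact = np.array([n ** 2 * np.pi ** 2 / 2.0 for n in (1, 2, 3)])
print("relative error of numerical levels:", np.abs(numeric - exact) / exact)
```

The n² growth of the levels, and the 1/L² dependence on the box width, both come directly from the constraint k = nπ/L.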






Free particle 

For example, consider a free particle. In quantum mechanics, there is wave-particle duality, so the properties 
of the particle can be described as the properties of a wave. Therefore, its quantum state can be represented as a 
wave of arbitrary shape extending over space as a wave function. The position and momentum of the particle 
are observables. The Uncertainty Principle states that the position and the momentum cannot both be measured 
simultaneously with full precision. However, one can measure the position alone of a moving free particle, creating 
an eigenstate of position with a wavefunction that is very large (a Dirac delta) at a particular position x and zero 
everywhere else. If one performs a position measurement on such a wavefunction, the result x will be obtained with 
100% probability (full certainty). This is called an eigenstate of position (mathematically more precise: a 
generalized position eigenstate (eigendistribution)). If the particle is in an eigenstate of position then its 
momentum is completely unknown. On the other hand, if the particle is in an eigenstate of momentum then its position 
is completely unknown.[45] In an eigenstate of momentum having a plane wave form, it can be shown that the wavelength 
is equal to h/p, where h is Planck's constant and p is the momentum of the eigenstate.[46] 
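
Two statements in this paragraph lend themselves to a quick numerical check: the wavelength relation λ = h/p, and the trade-off between position and momentum spread. The sketch below computes the wavelength of an electron with an assumed kinetic energy of 100 eV (non-relativistic), and evaluates the product of the position and momentum spreads for a Gaussian wave packet, which is the case that reaches the lower limit ħ/2 of the uncertainty relation. The Gaussian packet, grid parameters and function names are illustrative assumptions, not part of the source.

```python
import numpy as np

H_PLANCK = 6.62607015e-34   # Planck constant, J*s
M_E = 9.1093837015e-31      # electron mass, kg
EV = 1.602176634e-19        # one electronvolt in joules


def de_broglie_wavelength(momentum):
    """Wavelength of a momentum eigenstate (plane wave): lambda = h / p."""
    return H_PLANCK / momentum


def gaussian_uncertainty_product(sigma, half_width=12.0, points=4001):
    """Return sigma_x * sigma_p / hbar for psi(x) proportional to exp(-x^2 / (4 sigma^2)).

    The packet is real and symmetric, so <x> = 0 and <p> = 0, and
    <p^2> / hbar^2 equals the normalised integral of (d psi / dx)^2.
    """
    x = np.linspace(-half_width * sigma, half_width * sigma, points)
    dx = x[1] - x[0]
    psi = np.exp(-x ** 2 / (4.0 * sigma ** 2))
    norm = np.sum(psi ** 2) * dx
    mean_x2 = np.sum(x ** 2 * psi ** 2) * dx / norm
    dpsi = np.gradient(psi, dx)
    mean_p2_over_hbar2 = np.sum(dpsi ** 2) * dx / norm
    return float(np.sqrt(mean_x2 * mean_p2_over_hbar2))


# Hypothetical example: electron with 100 eV of kinetic energy (non-relativistic).
momentum = np.sqrt(2.0 * M_E * 100.0 * EV)
wavelength = de_broglie_wavelength(momentum)
print(f"de Broglie wavelength: {wavelength * 1e9:.4f} nm")

product = gaussian_uncertainty_product(sigma=1.0)
print(f"sigma_x * sigma_p = {product:.4f} hbar")
```

Narrowing the packet (smaller sigma) leaves the product unchanged, because the momentum spread grows in exact proportion; the position and momentum eigenstates described above are the two limiting cases.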



Notes 

[1] J. Mehra and H. Rechenberg, The historical development of quantum theory, Springer-Verlag, 1982. 
[2] T.S. Kuhn, Black-body theory and the quantum discontinuity 1894-1912, Clarendon Press, Oxford, 1978. 
[3] A. Einstein, Über einen die Erzeugung und Verwandlung des Lichtes betreffenden heuristischen Gesichtspunkt (On a heuristic point of view 
concerning the production and transformation of light), Annalen der Physik 17 (1905) 132-148 (reprinted in The collected papers of Albert 
Einstein, John Stachel, editor, Princeton University Press, 1989, Vol. 2, pp. 149-166, in German; see also Einstein's early work on the 
quantum hypothesis, ibid. pp. 134-148). 
[4] "Merriam-Webster.com" (http://www.merriam-webster.com/dictionary/quantum). Merriam-Webster.com. 2010-08-13. Retrieved 2010-10-15. 
[5] Edwin Thall. "FCCJ.org" (http://mooni.fccj.org/~ethall/quantum/quant.htm). Mooni.fccj.org. Retrieved 2010-10-15. 
[6] Compare the list of conferences presented here (http://ysfine.com/). 
[7] Oocities.com (http://web.archive.org/web/20091026095410/http://geocities.com/mik_malm/quantmech.html) 
[8] P.A.M. Dirac, The Principles of Quantum Mechanics, Clarendon Press, Oxford, 1930. 
[9] J. von Neumann, Mathematische Grundlagen der Quantenmechanik, Springer, Berlin, 1932 (English translation: Mathematical Foundations 
of Quantum Mechanics, Princeton University Press, 1955). 
[10] Greiner, Walter; Müller, Berndt (1994). Quantum Mechanics Symmetries, Second edition (http://books.google.com/books?id=gCfvWx6vuzUC&pg=PA52). Springer-Verlag. p. 52. ISBN 3-540-58080-8. 
[11] "AIP.org" (http://www.aip.org/history/heisenberg/p08a.htm). AIP.org. Retrieved 2010-10-15. 
[12] Greenstein, George; Zajonc, Arthur (2006). The Quantum Challenge: Modern Research on the Foundations of Quantum Mechanics, Second 
edition (http://books.google.com/books?id=5t0tm0FBlCsC&pg=PA215). Jones and Bartlett Publishers, Inc. p. 215. ISBN 0-7637-2470-X. 
[13] probability clouds are approximate, but better than the Bohr model, whereby electron location is given by a probability function, the wave 
function eigenvalue, such that the probability is the squared modulus of the complex amplitude 
[14] "Actapress.com" (http://www.actapress.com/PaperInfo.aspx?PaperID=25988&reason=500). Actapress.com. Retrieved 2010-10-15. 
[15] Hirshleifer, Jack (2001). The Dark Side of the Force: Economic Foundations of Conflict Theory (http://books.google.com/books?id=W2J2IXgiZVgC&pg=PA265). Cambridge University Press. p. 265. ISBN 0-521-80412-4. 



3D confined electron wave functions for each eigenstate in a quantum dot (two panels, each titled "3D Wavefunctions"). 
Here, rectangular and triangular-shaped quantum dots are shown. Energy states in rectangular 
dots are more 's-type' and 'p-type'. However, in a triangular dot the wave functions are 
mixed due to confinement symmetry. 






[16] Dict.cc (http://www.dict.cc/german-english/eigen.html); De.pons.eu (http://de.pons.eu/deutsch-englisch/eigen) 
[17] "PHY.olemiss.edu" (http://www.phy.olemiss.edu/~luca/Topics/qm/collapse.html). PHY.olemiss.edu. 2010-08-16. Retrieved 2010-10-15. 
[18] "Farside.ph.utexas.edu" (http://farside.ph.utexas.edu/teaching/qmech/lectures/node28.html). Farside.ph.utexas.edu. Retrieved 2010-10-15. 
[19] "Reddit.com" (http://www.reddit.com/r/philosophy/comments/8p2qv/determinism_and_naive_realism/). Reddit.com. 2009-06-01. Retrieved 2010-10-15. 
[20] Michael Trott. "Time-Evolution of a Wavepacket in a Square Well - Wolfram Demonstrations Project" (http://demonstrations.wolfram.com/TimeEvolutionOfAWavepacketInASquareWell/). Demonstrations.wolfram.com. Retrieved 2010-10-15. 
[21] Michael Trott. "Time Evolution of a Wavepacket In a Square Well" (http://demonstrations.wolfram.com/TimeEvolutionOfAWavepacketInASquareWell/). Demonstrations.wolfram.com. Retrieved 2010-10-15. 
[22] Mathews, Piravonu Mathews; Venkatesan, K. (1976). A Textbook of Quantum Mechanics (http://books.google.com/books?id=_qzslDD3TcsC&pg=PA36). Tata McGraw-Hill. p. 36. ISBN 0-07-096510-2. 
[23] "Wave Functions and the Schrodinger Equation" (http://physics.ukzn.ac.za/~petruccione/Phys120/Wave Functions and the Schrodinger Equation.pdf) (PDF). Retrieved 2010-10-15. 
[24] "Spaceandmotion.com" (http://www.spaceandmotion.com/physics-quantum-mechanics-werner-heisenberg.htm). Spaceandmotion.com. Retrieved 2010-10-15. 
[25] Especially since Werner Heisenberg was awarded the Nobel Prize in Physics in 1932 for the creation of quantum mechanics, the role of Max 
Born has been obfuscated. A 2005 biography of Born details his role as the creator of the matrix formulation of quantum mechanics. This was 
recognized in a paper by Heisenberg, in 1940, honoring Max Planck. See: Nancy Thorndike Greenspan, "The End of the Certain World: The 
Life and Science of Max Born" (Basic Books, 2005), pp. 124-128, and 285-286. 
[26] "IF.uj.edu.pl" (http://th-www.if.uj.edu.pl/acta/vol19/pdf/v19p0683.pdf) (PDF). Retrieved 2010-10-15. 
[27] "OCW.ssu.edu" (http://ocw.usu.edu/physics/classical-mechanics/pdf_lectures/06.pdf) (PDF). Retrieved 2010-10-15. 
[28] "The Nobel Prize in Physics 1979" (http://nobelprize.org/nobel_prizes/physics/laureates/1979/index.html). Nobel Foundation. Retrieved 2010-02-16. 
[29] Complex Elliptic Pendulum (http://arxiv.org/abs/1001.0131), Carl M. Bender, Daniel W. Hook, Karta Kooner 
[30] "Scribd.com" (http://www.scribd.com/doc/5998949/Quantum-mechanics-course-iwhatisquantummechanics). Scribd.com. 2008-09-14. Retrieved 2010-10-15. 
[31] Philsci-archive.pitt.edu (http://philsci-archive.pitt.edu/archive/00002328/01/handbook.pdf) 
[32] "Academic.brooklyn.cuny.edu" (http://academic.brooklyn.cuny.edu/physics/sobel/Nucphys/atomprop.html). Academic.brooklyn.cuny.edu. Retrieved 2010-10-15. 
[33] "Cambridge.org" (http://assets.cambridge.org/97805218/29526/excerpt/9780521829526_excerpt.pdf) (PDF). Retrieved 2010-10-15. 
[34] "There is as yet no logically consistent and complete relativistic quantum field theory.", p. 4. V. B. Berestetskii, E. M. Lifshitz, L. P. 
Pitaevskii (1971). J. B. Sykes, J. S. Bell (translators). Relativistic Quantum Theory 4, part I. Course of Theoretical Physics (Landau and 
Lifshitz). ISBN 0080160255. 
[35] "Life on the lattice: The most accurate theory we have" (http://latticeqcd.blogspot.com/2005/06/most-accurate-theory-we-have.html). Latticeqcd.blogspot.com. 2005-06-03. Retrieved 2010-10-15. 
[36] Parker, B. (1993). Overcoming some of the problems, pp. 259-279. 
[37] The Character of Physical Law (1965) Ch. 6; also quoted in The New Quantum Universe (2003) by Tony Hey and Patrick Walters 
[38] "Plato.stanford.edu" (http://plato.stanford.edu/entries/qm-action-distance/). Plato.stanford.edu. 2007-01-26. Retrieved 2010-10-15. 
[39] "Plato.stanford.edu" (http://plato.stanford.edu/entries/qm-everett/). Plato.stanford.edu. Retrieved 2010-10-15. 
[40] "Books.google.com" (http://books.google.com/books?id=vdXU6SD4_UYC). Books.google.com. Retrieved 2010-10-23. 
[41] "en.wikibooks.org" (http://en.wikibooks.org/wiki/Computational_chemistry/Applications_of_molecular_quantum_mechanics). En.wikibooks.org. Retrieved 2010-10-23. 
[42] Anderson, Mark (2009-01-13). "Discovermagazine.com" (http://discovermagazine.com/2009/feb/13-is-quantum-mechanics-controlling-your-thoughts/article_view?b_start:int=1&-C). Discovermagazine.com. Retrieved 2010-10-23. 
[43] "Quantum mechanics boosts photosynthesis" (http://physicsworld.com/cws/article/news/41632). physicsworld.com. Retrieved 2010-10-23. 
[44] Derivation of particle in a box, chemistry.tidalswan.com (http://chemistry.tidalswan.com/index.php?title=Quantum_Mechanics) 
[45] Davies, P. C. W.; Betts, David S. (1984). Quantum Mechanics, Second edition (http://books.google.com/books?id=XRyHCrGNstoC&pg=PA79). Chapman and Hall. p. 79. ISBN 0-7487-4446-0. 
[46] "Books.Google.com" (http://books.google.com/books?id=tKm-Ekwke_UC). Books.Google.com. 2007-08-30. Retrieved 2010-10-23. 






References 

The following titles, all by working physicists, attempt to communicate quantum theory to lay people, using a 
minimum of technical apparatus. 

• Chester, Marvin (1987) Primer of Quantum Mechanics. John Wiley. ISBN 0-486-42878-8 

• Richard Feynman, 1985. QED: The Strange Theory of Light and Matter, Princeton University Press. ISBN 
0-691-08388-6. Four elementary lectures on quantum electrodynamics and quantum field theory, yet containing 
many insights for the expert. 

• Ghirardi, GianCarlo, 2004. Sneaking a Look at God's Cards, Gerald Malsbary, trans. Princeton Univ. Press. The 
most technical of the works cited here. Passages using algebra, trigonometry, and bra-ket notation can be passed 
over on a first reading. 

• N. David Mermin, 1990, "Spooky actions at a distance: mysteries of the QT" in his Boojums all the way through. 
Cambridge University Press: 110-76. 

• Victor Stenger, 2000. Timeless Reality: Symmetry, Simplicity, and Multiple Universes. Buffalo NY: Prometheus 
Books. Chpts. 5-8. Includes cosmological and philosophical considerations. 

More technical: 

• Bryce DeWitt, R. Neill Graham, eds., 1973. The Many-Worlds Interpretation of Quantum Mechanics, Princeton 
Series in Physics, Princeton University Press. ISBN 0-691-08131-X 

• Dirac, P. A. M. (1930). The Principles of Quantum Mechanics. ISBN 0198520115. The beginning chapters make 
up a very clear and comprehensible introduction. 

• Hugh Everett, 1957, "Relative State Formulation of Quantum Mechanics," Reviews of Modern Physics 29: 
454-62. 

• Feynman, Richard P.; Leighton, Robert B.; Sands, Matthew (1965). The Feynman Lectures on Physics. 1-3. 
Addison-Wesley. ISBN 0738200085. 

• Griffiths, David J. (2004). Introduction to Quantum Mechanics (2nd ed). Prentice Hall. ISBN 0-13-111892-7. 
OCLC 40251748. A standard undergraduate text. 

• Max Jammer, 1966. The Conceptual Development of Quantum Mechanics. McGraw Hill. 

• Hagen Kleinert, 2004. Path Integrals in Quantum Mechanics, Statistics, Polymer Physics, and Financial Markets, 
3rd ed. Singapore: World Scientific. Draft of 4th edition, (http://www.physik.fu-berlin.de/~kleinert/b5) 

• Gunther Ludwig, 1968. Wave Mechanics. London: Pergamon Press. ISBN 0-08-203204-1 

• George Mackey (2004). The mathematical foundations of quantum mechanics. Dover Publications. ISBN 
0-486-43517-2. 

• Albert Messiah, 1966. Quantum Mechanics (Vol. I), English translation from French by G. M. Temmer. North 
Holland, John Wiley & Sons. Cf. chpt. IV, section III. 

• Omnes, Roland (1999). Understanding Quantum Mechanics. Princeton University Press. ISBN 0-691-00435-8. 
OCLC 39849482. 

• Scerri, Eric R., 2006. The Periodic Table: Its Story and Its Significance. Oxford University Press. Considers the 
extent to which chemistry and the periodic system have been reduced to quantum mechanics. ISBN 
0-19-530573-6 

• Transnational College of Lex (1996). What is Quantum Mechanics? A Physics Adventure. Language Research 
Foundation, Boston. ISBN 0-9643504-1-6. OCLC 34661512. 

• von Neumann, John (1955). Mathematical Foundations of Quantum Mechanics. Princeton University Press. 
ISBN 0691028931. 

• Hermann Weyl, 1950. The Theory of Groups and Quantum Mechanics, Dover Publications. 

• D. Greenberger, K. Hentschel, F. Weinert, eds., 2009. Compendium of quantum physics, Concepts, experiments, 
history and philosophy, Springer-Verlag, Berlin, Heidelberg. 






Further reading 

• Bernstein, Jeremy (2009). Quantum Leaps (http://books.google.com/books?id=jOMe3brYOLOC&printsec=frontcover). 
Cambridge, Massachusetts: Belknap Press of Harvard University Press. ISBN 9780674035416. 

• Bohm, David (1989). Quantum Theory. Dover Publications. ISBN 0-486-65969-0. 

• Eisberg, Robert; Resnick, Robert (1985). Quantum Physics of Atoms, Molecules, Solids, Nuclei, and Particles 
(2nd ed). Wiley. ISBN 0-471-87373-X. 

• Liboff, Richard L. (2002). Introductory Quantum Mechanics. Addison-Wesley. ISBN 0-8053-8714-5. 

• Merzbacher, Eugen (1998). Quantum Mechanics. Wiley, John & Sons, Inc. ISBN 0-471-88702-1. 

• Sakurai, J. J. (1994). Modern Quantum Mechanics. Addison Wesley. ISBN 0-201-53929-2. 

• Shankar, R. (1994). Principles of Quantum Mechanics. Springer. ISBN 0-306-44790-8. 

External links 

• The Modern Revolution in Physics (http://www.lightandmatter.com/html_books/6mr/ch01/ch01.html) - an 
online textbook. 

• J. O'Connor and E. F. Robertson: A history of quantum mechanics. 
(http://www-history.mcs.st-andrews.ac.uk/history/HistTopics/The_Quantum_age_begins.html) 

• Introduction to Quantum Theory at Quantiki. (http://www.quantiki.org/wiki/index.php/ 
Introduction_to_Quantum_Theory) 

• Quantum Physics Made Relatively Simple (http://bethe.cornell.edu/): three video lectures by Hans Bethe 

• H is for h-bar. (http://www.nonlocal.com/hbar/) 

• Quantum Mechanics Books Collection (http://www.freebookcentre.net/Physics/Quantum-Mechanics-Books. 
html): Collection of free books 

Course material 

• Doron Cohen: Lecture notes in Quantum Mechanics (comprehensive, with advanced topics), (http://arxiv.org/ 
abs/quant-ph/0605180) 

• MIT OpenCourseWare: Chemistry (http://ocw.mit.edu/OcwWeb/Chemistry/index.htm). 

• MIT OpenCourseWare: Physics (http://ocw.mit.edu/OcwWeb/Physics/index.htm). See 8.04 (http://ocw. 
mit.edu/OcwWeb/Physics/8-04Spring-2006/CourseHome/index.htm) 

• Stanford Continuing Education PHY 25: Quantum Mechanics (http://www.youtube.com/stanford#g/c/84C10A9CB1D13841) 
by Leonard Susskind, see course description 
(http://continuingstudies.stanford.edu/courses/course.php?cid=20072_PHY 25) Fall 2007 

• 5½ Examples in Quantum Mechanics (http://www.physics.csbsju.edu/QM/) 

• Imperial College Quantum Mechanics Course, (http://www.imperial.ac.uk/quantuminformation/qi/tutorials) 

• Spark Notes - Quantum Physics. 
(http://www.sparknotes.com/testprep/books/sat2/physics/chapter19section3.rhtml) 

• Quantum Physics Online: interactive introduction to quantum mechanics (RS applets). 
(http://www.quantum-physics.polytechnique.fr/) 

• Experiments to the foundations of quantum physics with single photons. (http://www.didaktik.physik. 
uni-erlangen.de/quantumlab/english/index.html) 

• Motion Mountain, Volume IV (http://www.motionmountain.net/download.html) - A modern introduction to 
quantum theory, with several animations. 

• AQME (http://www.nanohub.org/topics/AQME) : Advancing Quantum Mechanics for Engineers — by 
T.Barzso, D.Vasileska and G.Klimeck online learning resource with simulation tools on nanohub 

• Quantum Mechanics (http://www.lsr.ph.ic.ac.uk/~plenio/lecture.pdf) by Martin Plenio 

• Quantum Mechanics (http://farside.ph.utexas.edu/teaching/qm/389.pdf) by Richard Fitzpatrick 






• Online course on Quantum Transport (http://nanohub.org/resources/2039) 

FAQs 

• Many-worlds or relative-state interpretation. (http://www.hedweb.com/manworld.htm) 

• Measurement in Quantum mechanics, (http://www.mtnmath.com/faq/meas-qm.html) 

Media 

• Lectures on Quantum Mechanics by Leonard Susskind 
(http://www.youtube.com/view_play_list?p=84C10A9CB1D13841) 

• Everything you wanted to know about the quantum world 
(http://www.newscientist.com/channel/fundamentals/quantum-world) - archive of articles from New Scientist. 

• Quantum Physics Research (http://www.sciencedaily.com/news/matter_energy/quantum_physics/) from 
Science Daily 

• Overbye, Dennis (December 27, 2005). "Quantum Trickery: Testing Einstein's Strangest Theory" (http://www. 
nytimes.com/2005/12/27/science/27eins.html?ex=1293339600&en=caf5d835203c3500&ei=5090). The 
New York Times. Retrieved April 12, 2010. 

• Audio: Astronomy Cast (http://www.astronomycast.com/physics/ep-138-quantum-mechanics/) Quantum 
Mechanics — June 2009. Fraser Cain interviews Pamela L. Gay. 

Philosophy 

• "Quantum Mechanics" (http://plato.stanford.edu/entries/qm) entry by Jenann Ismael in the Stanford 
Encyclopedia of Philosophy 

• "Measurement in Quantum Theory" (http://plato.stanford.edu/entries/qm) entry by Henry Krips in the 
Stanford Encyclopedia of Philosophy 

Schrodinger equation 

In physics, specifically quantum mechanics, the Schrodinger equation, formulated in 1926 by Austrian physicist 
Erwin Schrodinger, is an equation that describes how the quantum state of a physical system changes in time. It is as 
central to quantum mechanics as Newton's laws are to classical mechanics. 

Two forms of the Schrodinger equation 

In the standard interpretation of quantum mechanics, the quantum state, also called a wavefunction or state vector, is 
the most complete description that can be given to a physical system. Solutions to Schrodinger's equation describe 
not only molecular, atomic and subatomic systems, but also macroscopic systems, possibly even the whole universe.[1] 

The most general form is the time-dependent Schrodinger equation, which gives a description of a system evolving 
with time. For systems in a stationary state, the time-independent Schrodinger equation is sufficient. Approximate 
solutions to the time-independent Schrodinger equation are commonly used to calculate the energy levels and other 
properties of atoms and molecules. 

Schrodinger's equation can be mathematically transformed into Werner Heisenberg's matrix mechanics, and into 
Richard Feynman's path integral formulation. The Schrodinger equation describes time in a way that is inconvenient 
for relativistic theories, a problem which is not as severe in matrix mechanics and completely absent in the path 
integral formulation. 



The Schrodinger equation 

The Schrodinger equation takes several different forms, depending on the physical situation. This section presents 
the equation for the general case and for the simple case encountered in many textbooks. 

General quantum system 

For a general quantum system:

$$i\hbar\frac{\partial}{\partial t}\Psi = \hat{H}\Psi$$

where

• $\Psi$ is the wave function, the probability amplitude for different configurations of the system at different times,

• $i\hbar\frac{\partial}{\partial t}$ is the energy operator ($i$ is the imaginary unit and $\hbar$ is the reduced Planck constant),

• $\hat{H}$ is the Hamiltonian operator.

Single particle in a potential 

For a single particle with potential energy V in position space, the Schrodinger equation takes the form:

$$i\hbar\frac{\partial}{\partial t}\Psi(\mathbf{r}, t) = \hat{H}\Psi = \left(-\frac{\hbar^2}{2m}\nabla^2 + V(\mathbf{r})\right)\Psi(\mathbf{r}, t) = -\frac{\hbar^2}{2m}\nabla^2\Psi(\mathbf{r}, t) + V(\mathbf{r})\Psi(\mathbf{r}, t)$$

where

• $-\frac{\hbar^2}{2m}\nabla^2$ is the kinetic energy operator, where m is the mass of the particle.

• $\nabla^2$ is the Laplace operator. In three dimensions, the Laplace operator is $\frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} + \frac{\partial^2}{\partial z^2}$, where x, y, and z are the Cartesian coordinates of space.

• $V(\mathbf{r})$ is the time-independent potential energy at the position r.

• $\Psi(\mathbf{r}, t)$ is the probability amplitude for the particle to be found at position r at time t.

• $\hat{H} = -\frac{\hbar^2}{2m}\nabla^2 + V(\mathbf{r})$ is the Hamiltonian operator for a single particle in a potential.

Time independent or stationary equation 

The time independent equation, again for a single particle with potential energy V, takes the form:[4]

$$E\psi(\mathbf{r}) = -\frac{\hbar^2}{2m}\nabla^2\psi(\mathbf{r}) + V(\mathbf{r})\psi(\mathbf{r}).$$

This equation describes the standing wave solutions of the time-dependent equation, which are the states with 
definite energy. 
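
As a brief worked step (a sketch, assuming a separable trial solution whose time dependence is a pure phase), the stationary equation can be recovered from the time-dependent one as follows:

```latex
\Psi(\mathbf{r}, t) = \psi(\mathbf{r})\, e^{-iEt/\hbar}
\quad\Longrightarrow\quad
i\hbar\frac{\partial \Psi}{\partial t} = E\,\psi(\mathbf{r})\, e^{-iEt/\hbar},
\qquad
\hat{H}\Psi = \left(-\frac{\hbar^2}{2m}\nabla^2\psi(\mathbf{r}) + V(\mathbf{r})\,\psi(\mathbf{r})\right) e^{-iEt/\hbar}.
```

Equating the two sides and cancelling the common phase factor leaves the time-independent equation for $\psi(\mathbf{r})$. Since $|\Psi|^2 = |\psi|^2$ does not depend on time, such solutions are stationary.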



Historical background and development 

Following Max Planck's quantization of light (see black body radiation), Albert Einstein interpreted Planck's 
quantum to be photons, particles of light, and proposed that the energy of a photon is proportional to its frequency, 
one of the first signs of wave-particle duality. Since energy and momentum are related in the same way as frequency 
and wavenumber in special relativity, it followed that the momentum p of a photon is proportional to its wavenumber 
k. 

$$p = \frac{h}{\lambda} = \hbar k$$

Louis de Broglie hypothesized that this is true for all particles, even particles such as electrons. Assuming that the 
waves travel roughly along classical paths, he showed that they form standing waves for certain discrete frequencies. 
These correspond to discrete energy levels, which reproduced the old quantum condition.

Following up on these ideas, Schrodinger decided to find a proper wave equation for the electron. He was guided by 
William R. Hamilton's analogy between mechanics and optics, encoded in the observation that the zero-wavelength 
limit of optics resembles a mechanical system — the trajectories of light rays become sharp tracks which obey 
Fermat's principle, an analog of the principle of least action.[6] A modern version of his reasoning is reproduced in
the next section. The equation he found is:

$$i\hbar\frac{\partial}{\partial t}\Psi(\mathbf{x}, t) = -\frac{\hbar^2}{2m}\nabla^2\Psi(\mathbf{x}, t) + V(\mathbf{x})\Psi(\mathbf{x}, t).$$

Using this equation, Schrodinger computed the hydrogen spectral series by treating a hydrogen atom's electron as a
wave $\Psi(\mathbf{x}, t)$, moving in a potential well V, created by the proton. This computation accurately reproduced the energy
levels of the Bohr model.

However, by that time, Arnold Sommerfeld had refined the Bohr model with relativistic corrections.[7][8]
Schrodinger used the relativistic energy momentum relation to find what is now known as the Klein-Gordon
equation in a Coulomb potential (in natural units):

$$\left(E + \frac{e^2}{r}\right)^2\psi(x) = -\nabla^2\psi(x) + m^2\psi(x).$$

He found the standing waves of this relativistic equation, but the relativistic corrections disagreed with Sommerfeld's
formula. Discouraged, he put away his calculations and secluded himself in an isolated mountain cabin with a
lover.[9]

While at the cabin, Schrodinger decided that his earlier non-relativistic calculations were novel enough to publish,
and decided to leave off the problem of relativistic corrections for the future. He put together his wave equation and
the spectral analysis of hydrogen in a paper in 1926.[10] The paper was enthusiastically endorsed by Einstein, who
saw the matter-waves as an intuitive depiction of nature, as opposed to Heisenberg's matrix mechanics, which he
considered overly formal.[11]

The Schrodinger equation details the behaviour of $\psi$ but says nothing of its nature. Schrodinger tried to interpret it as
a charge density in his fourth paper, but he was unsuccessful.[12] In 1926, just a few days after Schrodinger's fourth
and final paper was published, Max Born successfully interpreted $\psi$ as a probability amplitude.[13] Schrodinger,
though, always opposed a statistical or probabilistic approach, with its associated discontinuities (much like
Einstein, who believed that quantum mechanics was a statistical approximation to an underlying deterministic
theory) and never reconciled with the Copenhagen interpretation.[14]




Derivation 

Short heuristic derivation 

Schrodinger's equation can be derived in the following short heuristic way.

Assumptions

1. The total energy E of a particle is

$$E = T + V = \frac{p^2}{2m} + V.$$

This is the classical expression for a particle with mass m, where the total energy E is the sum of the kinetic
energy T and the potential energy V (which can vary with position and time); p and m are respectively the
momentum and the mass of the particle.

2. Einstein's light quanta hypothesis of 1905, which asserts that the energy E of a photon is proportional to the
frequency $\nu$ (or angular frequency, $\omega = 2\pi\nu$) of the corresponding electromagnetic wave:

$$E = h\nu = \hbar\omega.$$

3. The de Broglie hypothesis of 1924, which states that any particle can be associated with a wave, and that the
momentum p of the particle is related to the wavelength $\lambda$ (or wavenumber k) of such a wave by:

$$p = \frac{h}{\lambda} = \hbar k.$$

Expressing p and k as vectors, we have

$$\mathbf{p} = \hbar\mathbf{k}.$$

4. The three assumptions above allow one to derive the equation for plane waves only. To conclude that it is true in 
general requires the superposition principle, and thus, one must separately postulate that the Schrodinger equation 
is linear. 

Expressing the wave function as a complex plane wave 

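Schrodinger's idea was to express the phase of a plane wave as a complex phase factor:

$$\Psi = A e^{i(\mathbf{k}\cdot\mathbf{x} - \omega t)}$$

and to realize that since

$$\frac{\partial}{\partial t}\Psi = -i\omega\Psi$$

then

$$E\Psi = \hbar\omega\Psi = i\hbar\frac{\partial}{\partial t}\Psi$$

and similarly since

$$\frac{\partial}{\partial x}\Psi = i k_x\Psi$$

and

$$\frac{\partial^2}{\partial x^2}\Psi = -k_x^2\Psi$$

we find:

$$p_x^2\Psi = (\hbar k_x)^2\Psi = -\hbar^2\frac{\partial^2}{\partial x^2}\Psi$$

so that, again for a plane wave, he obtained:

$$p^2\Psi = (p_x^2 + p_y^2 + p_z^2)\Psi = -\hbar^2\left(\frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} + \frac{\partial^2}{\partial z^2}\right)\Psi = -\hbar^2\nabla^2\Psi$$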



And, by inserting these expressions for the energy and momentum into the classical formula we started with, we get
Schrodinger's famed equation, for a single particle in the 3-dimensional case in the presence of a potential V:

$$i\hbar\frac{\partial}{\partial t}\Psi = -\frac{\hbar^2}{2m}\nabla^2\Psi + V\Psi$$
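
The plane-wave step of this derivation can be checked numerically. The following is a minimal sketch, not part of the original derivation: it assumes natural units (the constants `HBAR` and `M` are set to 1 purely for illustration), one spatial dimension and V = 0, and the function names are hypothetical. It evaluates both sides of the free equation with central finite differences for a plane wave whose frequency obeys $\hbar\omega = \hbar^2k^2/2m$.

```python
import cmath

HBAR = 1.0  # assumed: natural units, chosen only for illustration
M = 1.0     # assumed particle mass


def plane_wave(x, t, k, amplitude=1.0):
    """Plane wave A*exp(i(kx - wt)) with the free-particle relation hbar*w = (hbar*k)^2 / 2m."""
    omega = HBAR * k ** 2 / (2.0 * M)
    return amplitude * cmath.exp(1j * (k * x - omega * t))


def schrodinger_residual(x, t, k, h=1e-3):
    """Return |i*hbar*dPsi/dt + (hbar^2/2m)*d2Psi/dx2| using central differences (V = 0)."""
    dpsi_dt = (plane_wave(x, t + h, k) - plane_wave(x, t - h, k)) / (2.0 * h)
    d2psi_dx2 = (
        plane_wave(x + h, t, k) - 2.0 * plane_wave(x, t, k) + plane_wave(x - h, t, k)
    ) / h ** 2
    lhs = 1j * HBAR * dpsi_dt
    rhs = -(HBAR ** 2) / (2.0 * M) * d2psi_dx2
    return abs(lhs - rhs)


if __name__ == "__main__":
    # The residual is limited only by the finite-difference truncation error.
    print(schrodinger_residual(x=0.3, t=0.7, k=2.0))
```

The residual shrinks as the step `h` is reduced (until rounding error dominates), which is the numerical counterpart of the statement that the plane wave solves the equation exactly.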

Versions 

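There are several equations that go by Schrodinger's name:

Time dependent equation

This is the equation of motion for the quantum state. In the most general form, it is written:[15]

$$i\hbar\frac{\partial}{\partial t}\Psi(x, t) = \hat{H}\Psi(x, t)$$

where $\hat{H}$ is a linear operator acting on the wavefunction $\Psi$. For the specific case of a single particle in one
dimension moving under the influence of a potential V:[15]

$$i\hbar\frac{\partial}{\partial t}\Psi(x, t) = -\frac{\hbar^2}{2m}\frac{\partial^2}{\partial x^2}\Psi(x, t) + V(x)\Psi(x, t)$$

and the operator $\hat{H}$ can be read off:

$$\hat{H} = -\frac{\hbar^2}{2m}\frac{\partial^2}{\partial x^2} + V(x).$$

For a particle in three dimensions, the only difference is more derivatives:

$$i\hbar\frac{\partial}{\partial t}\Psi(x, y, z, t) = -\frac{\hbar^2}{2m}\nabla^2\Psi(x, y, z, t) + V(x, y, z)\Psi(x, y, z, t)$$

and for N particles, the difference is that the wavefunction is in 3N-dimensional configuration space, the space of all
possible particle positions:[16]

$$i\hbar\frac{\partial}{\partial t}\Psi(\mathbf{x}_1, \ldots, \mathbf{x}_N, t) = -\frac{\hbar^2}{2}\left(\frac{\nabla_1^2}{m_1} + \cdots + \frac{\nabla_N^2}{m_N}\right)\Psi(\mathbf{x}_1, \ldots, \mathbf{x}_N, t) + V(\mathbf{x}_1, \ldots, \mathbf{x}_N)\Psi(\mathbf{x}_1, \ldots, \mathbf{x}_N, t)$$

This last equation is in a very high dimension, so that the solutions are not easy to visualize.

Time independent equation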

This is the equation for the standing waves, the eigenvalue equation for $\hat{H}$. In abstract form, for a general quantum
system, it is written:[15]

$$\hat{H}\psi = E\psi.$$

For a particle in one dimension,

$$E\psi = -\frac{\hbar^2}{2m}\frac{d^2\psi}{dx^2} + V(x)\psi.$$

But there is a further restriction: the solution must not grow at infinity, so that it has either a finite $L^2$-norm (if it is
a bound state) or a slowly diverging norm (if it is part of a continuum):[17]

$$\|\psi\|^2 = \int |\psi(x)|^2\,dx.$$

For example, when there is no potential, the equation reads:[18]

$$-E\psi = \frac{\hbar^2}{2m}\frac{d^2\psi}{dx^2}$$



which has oscillatory solutions for E > 0 (the $C_n$ are arbitrary constants):

$$\psi_E(x) = C_1 e^{i\sqrt{2mE/\hbar^2}\,x} + C_2 e^{-i\sqrt{2mE/\hbar^2}\,x}$$

and exponential solutions for E < 0:

$$\psi_{-|E|}(x) = C_1 e^{\sqrt{2m|E|/\hbar^2}\,x} + C_2 e^{-\sqrt{2m|E|/\hbar^2}\,x}.$$

The exponentially growing solutions have an infinite norm, and are not physical. They are not allowed in a finite
volume with periodic or fixed boundary conditions.

For a constant potential V the solution is oscillatory for E > V and exponential for E < V, corresponding to energies 
which are allowed or disallowed in classical mechanics. Oscillatory solutions have a classically allowed energy and 
correspond to actual classical motions, while the exponential solutions have a disallowed energy and describe a small 
amount of quantum bleeding into the classically disallowed region, known as quantum tunneling. If the potential V grows at
infinity, the motion is classically confined to a finite region, which means that in quantum mechanics every solution 
becomes an exponential far enough away. The condition that the exponential is decreasing restricts the energy levels 
to a discrete set, called the allowed energies. 
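
How the decay condition selects discrete energies can be illustrated with a simple shooting method. The sketch below is an illustration only and rests on several assumptions not made in the text: a harmonic potential $V(x) = x^2/2$ as the confining potential, units with $\hbar = m = 1$, an even-parity start at x = 0, and hypothetical function names. For a trial energy E it integrates $\psi'' = 2(V - E)\psi$ outward; the solution blows up with one sign just below an allowed energy and with the opposite sign just above it, so bisecting on that sign locates the allowed energy.

```python
def psi_at_edge(energy, x_max=6.0, steps=6000):
    """Integrate psi'' = (x^2 - 2E) psi from x = 0 to x_max (V = x^2/2, hbar = m = 1).

    Starts from an even-parity solution (psi = 1, psi' = 0) and uses a
    velocity-Verlet step; returns psi(x_max), whose sign shows which way
    the non-normalizable tail diverges.
    """
    dx = x_max / steps
    x, psi, dpsi = 0.0, 1.0, 0.0
    for _ in range(steps):
        accel = (x * x - 2.0 * energy) * psi
        psi_next = psi + dpsi * dx + 0.5 * accel * dx * dx
        x_next = x + dx
        accel_next = (x_next * x_next - 2.0 * energy) * psi_next
        dpsi += 0.5 * (accel + accel_next) * dx
        x, psi = x_next, psi_next
    return psi


def ground_state_energy(lo=0.1, hi=1.0, tol=1e-8):
    """Bisect on the sign of the diverging tail between two bracketing energies."""
    f_lo = psi_at_edge(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = psi_at_edge(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    # For this assumed potential the exact ground-state energy is 1/2 in these units.
    print(ground_state_energy())
```

Only at the bracketed energy does the growing exponential drop out of the solution; any other nearby energy leaves a diverging tail, which is the numerical face of the restriction to a discrete set.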

Nonlinear equation 

The nonlinear Schrodinger equation is the partial differential equation (in dimensionless form)[19]

$$i\partial_t\psi = -\frac{1}{2}\partial_x^2\psi + \kappa|\psi|^2\psi$$

for the complex field $\psi(x, t)$.

This equation arises from the Hamiltonian[19]

$$H = \int dx\left[\frac{1}{2}|\partial_x\psi|^2 + \frac{\kappa}{2}|\psi|^4\right]$$

with the Poisson brackets

$$\{\psi(x), \psi(y)\} = \{\psi^*(x), \psi^*(y)\} = 0$$

$$\{\psi^*(x), \psi(y)\} = i\delta(x - y).$$

It must be noted that this is a classical field equation. Unlike its linear counterpart, it never describes the time
evolution of a quantum state.

Properties 

The Schrodinger equation has certain properties. 

Local conservation of probability 

The probability density of a particle is $\Psi^*(x, t)\Psi(x, t)$. The probability flux is defined as [in units of
(probability)/(area × time)]:

$$\mathbf{j} = \frac{\hbar}{2mi}\left(\Psi^*\nabla\Psi - \Psi\nabla\Psi^*\right) = \frac{\hbar}{m}\operatorname{Im}\left(\Psi^*\nabla\Psi\right).$$

The probability flux satisfies the continuity equation:

$$\frac{\partial}{\partial t}P(x, t) + \nabla\cdot\mathbf{j} = 0$$

where $P(x, t)$ is the probability density [measured in units of (probability)/(volume)]. This equation is the
mathematical equivalent of the probability conservation law.
For a plane wave:

$$\psi(x, t) = A e^{i(kx - \omega t)}$$

$$|\psi|^2 = |A|^2$$

$$j(x, t) = |A|^2\frac{k\hbar}{m}.$$

So that not only is the probability of finding the particle the same everywhere, but the probability flux is as expected 
from an object moving at the classical velocity p/m. The reason that the Schrodinger equation admits a probability 
flux is because all the hopping is local and forward in time. 
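
The plane-wave flux can be reproduced from the definition of the probability flux with a few lines of code. This is a rough sketch under stated assumptions: one dimension, units with $\hbar = m = 1$ (the constants `HBAR` and `M` below), a central-difference derivative, and illustrative function names that do not come from the text.

```python
import cmath

HBAR = 1.0  # assumed: natural units for illustration
M = 1.0     # assumed particle mass


def plane_wave(x, k, amplitude):
    """Spatial part of a plane wave, A*exp(ikx), at a fixed time."""
    return amplitude * cmath.exp(1j * k * x)


def probability_current(wavefunction, x, h=1e-5):
    """j = (hbar/m) * Im(psi* dpsi/dx), with dpsi/dx from a central difference."""
    dpsi_dx = (wavefunction(x + h) - wavefunction(x - h)) / (2.0 * h)
    return (HBAR / M) * (wavefunction(x).conjugate() * dpsi_dx).imag


if __name__ == "__main__":
    k, amplitude = 3.0, 2.0
    j = probability_current(lambda x: plane_wave(x, k, amplitude), x=0.4)
    # Compare with |A|^2 * hbar * k / m, the classical velocity times the density.
    print(j, abs(amplitude) ** 2 * HBAR * k / M)
```

The result does not depend on the point x at which the current is evaluated, in line with the uniform density of a plane wave.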

Relativity 

The Schrodinger equation does not take into account relativistic effects; as a wave equation, it is invariant under a 
Galilean transformation, but not under a Lorentz transformation. But in order to include relativity, the physical 
picture must be altered. 

The Klein-Gordon equation uses the relativistic mass-energy relation: 

$$E^2 = p^2c^2 + m^2c^4$$

to produce the differential equation:

$$-\hbar^2\frac{\partial^2}{\partial t^2}\psi = -\hbar^2c^2\nabla^2\psi + m^2c^4\psi$$

which is relativistically invariant.
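
As a quick consistency check (a sketch that assumes the standard operator substitutions $E \to i\hbar\,\partial/\partial t$ and $\mathbf{p} \to -i\hbar\nabla$ and a plane-wave trial solution), the relativistic relation is recovered from the second-order wave equation:

```latex
\psi = A\, e^{i(\mathbf{k}\cdot\mathbf{x} - \omega t)}
\quad\Longrightarrow\quad
-\hbar^2\frac{\partial^2\psi}{\partial t^2} = \hbar^2\omega^2\,\psi,
\qquad
-\hbar^2 c^2\nabla^2\psi = \hbar^2 c^2 k^2\,\psi,
\\[4pt]
\hbar^2\omega^2 = \hbar^2 c^2 k^2 + m^2 c^4
\quad\Longleftrightarrow\quad
E^2 = p^2 c^2 + m^2 c^4
\qquad (E = \hbar\omega,\ \mathbf{p} = \hbar\mathbf{k}).
```

Both signs of $\omega$ satisfy this relation, which is the origin of the negative-frequency solutions of the Klein-Gordon equation.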

Solutions 

Some general techniques are: 

• Perturbation theory 

• The variational method 

• Quantum Monte Carlo methods 

• Density functional theory 

• The WKB approximation and semi-classical expansion

In some special cases, special methods can be used:

• List of quantum-mechanical systems with analytical solutions 

• Hartree-Fock method and post Hartree-Fock methods 

• Discrete delta-potential method 

Notes 

[1] Schrodinger, E. (1926). "An Undulatory Theory of the Mechanics of Atoms and Molecules" (http://home.tiscali.nl/physis/HistoricPaper/Schroedinger/Schroedinger1926c.pdf). Physical Review 28 (6): 1049-1070. doi:10.1103/PhysRev.28.1049.

[2] Shankar, R. (1994). Principles of Quantum Mechanics (2nd ed.). Kluwer Academic/Plenum Publishers, p. 143. ISBN 978-0-306-44790-7.

[3] Shankar, R. (1994). Principles of Quantum Mechanics (2nd ed.). Kluwer Academic/Plenum Publishers, p. 143ff. ISBN 978-0-306-44790-7.

[4] Shankar, R. (1994). Principles of Quantum Mechanics (2nd ed.). Kluwer Academic/Plenum Publishers, p. 145. ISBN 978-0-306-44790-7.

[5] de Broglie, L. (1925). "Recherches sur la theorie des quanta [On the Theory of Quanta]" (http://tel.archives-ouvertes.fr/docs/00/04/70/78/PDF/tel-00006807.pdf). Annales de Physique 10 (3): 22-128. Translated version (http://www.ensmp.fr/aflb/LDB-oeuvres/De_Broglie_Kracklauer.pdf).

[6] Schrodinger, E. (1984). Collected papers. Friedrich Vieweg und Sohn. ISBN 3700105738. See introduction to first 1926 paper.

[7] Sommerfeld, A. (1919). Atombau und Spektrallinien. Braunschweig: Friedrich Vieweg und Sohn. ISBN 3871444847.

[8] For an English source, see Haar, T. The Old Quantum Theory.

[9] Rhodes, R. (1986). Making of the Atomic Bomb. Touchstone. ISBN 0-671-44133-7.

[10] Schrodinger, E. (1926). "Quantisierung als Eigenwertproblem; von Erwin Schrodinger" (http://gallica.bnf.fr/ark:/12148/bpt6k153811.image.langFR.f373.pagination). Annalen der Physik (Leipzig): 361-377.

[11] Einstein, A.; et al. Letters on Wave Mechanics: Schrodinger-Planck-Einstein-Lorentz.

[12] Moore, W.J. (1992). Schrodinger: Life and Thought. Cambridge University Press, p. 219. ISBN 0-521-43767-9.

[13] Moore, W.J. (1992). Schrodinger: Life and Thought. Cambridge University Press, p. 220. ISBN 0-521-43767-9.

[14] It is clear, as shown in a letter to Max Born, that even in his last year of life Schrodinger never accepted the Copenhagen interpretation. Cf. p. 220 of Moore, W.J. (1992). Schrodinger: Life and Thought. Cambridge University Press, p. 479. ISBN 0-521-43767-9.

[15] Shankar, R. (1994). Principles of Quantum Mechanics. Kluwer Academic/Plenum Publishers, pp. 143ff. ISBN 978-0-306-44790-7.

[16] Shankar, R. (1994). Principles of Quantum Mechanics. Kluwer Academic/Plenum Publishers, p. 141. ISBN 978-0-306-44790-7.

[17] Feynman, R.P.; Leighton, R.B.; Sands, M. (1964). "Operators". The Feynman Lectures on Physics. 3. Addison-Wesley, pp. 20-7. ISBN 0201021153.

[18] Shankar, R. (1994). Principles of Quantum Mechanics. Kluwer Academic/Plenum Publishers, pp. 151ff. ISBN 978-0-306-44790-7.

[19] V.E. Zakharov; S.V. Manakov (1974). "On the complete integrability of a nonlinear Schrodinger equation". Journal of Theoretical and Mathematical Physics 19 (3): 551-559. doi:10.1007/BF01035568. Originally in: Teoreticheskaya i Matematicheskaya Fizika 19 (3): 332-343, June 1974.

References 

• Paul Adrien Maurice Dirac (1958). The Principles of Quantum Mechanics (4th ed.). Oxford University Press. 

• David J. Griffiths (2004). Introduction to Quantum Mechanics (2nd ed.). Benjamin Cummings. 
ISBN 0131244051. 

• Richard Liboff (2002). Introductory Quantum Mechanics (4th ed.). Addison Wesley. ISBN 0805387145. 

• David Halliday (2007). Fundamentals of Physics (8th ed.). Wiley. ISBN 0471159506.

• Serway, Moses, and Moyer (2004). Modern Physics (3rd ed.). Brooks Cole. ISBN 0534493408. 

• Walter John Moore (1992). Schrodinger: Life and Thought. Cambridge University Press. ISBN 0521437679. 

• Schrodinger, Erwin (December 1926). "An Undulatory Theory of the Mechanics of Atoms and Molecules". Phys.
Rev. 28 (6): 1049-1070. doi:10.1103/PhysRev.28.1049.

External links 

• Quantum Physics (http://www.lightandmatter.com/html_books/0sn/ch13/ch13.html) - a textbook with a
treatment of the time-independent Schrodinger equation

• Linear Schrodinger Equation (http://eqworld.ipmnet.ru/en/solutions/lpde/lpde108.pdf) at EqWorld: The
World of Mathematical Equations.

• Nonlinear Schrodinger Equation (http://eqworld.ipmnet.ru/en/solutions/npde/npde1403.pdf) at EqWorld:
The World of Mathematical Equations.

• The Schrodinger Equation in One Dimension (http://www.colorado.edu/UCB/AcademicAffairs/ArtsSciences/physics/TZD/PageProofs1/TAYL07-203-247.I.pdf)
as well as the directory of the book (http://www.colorado.edu/UCB/AcademicAffairs/ArtsSciences/physics/TZD/PageProofs1/).

• All about 3D Schrodinger Equation (http://hyperphysics.phy-astr.gsu.edu/hbase/hframe.html) 

• Mathematical aspects of Schrodinger equations are discussed on the Dispersive PDE Wiki (http://tosio.math.toronto.edu/wiki/index.php/Main_Page).

• Web-Schrodinger: Interactive solution of the 2D time dependent Schrodinger equation (http://www.nanotechnology.hu/online/web-schroedinger/index.html)

• An alternate derivation of the Schrodinger Equation (http://behindtheguesses.blogspot.com/2009/06/ 
schrodinger-equation-corrections.html) 

• Online software- Periodic Potential Lab (http://nanohub.org/resources/3847) Solves the time independent 
Schrodinger equation for arbitrary periodic potentials. 



Dirac equation 

The Dirac equation is a relativistic quantum mechanical wave equation formulated by British physicist Paul Dirac 
in 1928. It provides a description of elementary spin-½ particles, such as electrons, consistent with both the
principles of quantum mechanics and the theory of special relativity. The equation demands the existence of 
antiparticles and actually predated their experimental discovery. This made the discovery of the positron, the 
antiparticle of the electron, one of the greatest triumphs of modern theoretical physics. 



Mathematical formulation 

The Dirac equation in the form originally proposed by Dirac is:

$$\left(\beta mc^2 + \sum_{k=1}^{3}\alpha_k p_k c\right)\psi(\mathbf{x}, t) = i\hbar\frac{\partial\psi(\mathbf{x}, t)}{\partial t}$$

where

m is the rest mass of the electron,

c is the speed of light,

p is the momentum operator,

x and t are the space and time coordinates,

$\hbar = h/2\pi$ is the reduced Planck constant, also known as Dirac's constant.

The new elements in this equation are the 4×4 matrices $\alpha_k$ and $\beta$, and the four-component wavefunction $\psi$. The
matrices are all Hermitian and have squares equal to the identity matrix:

$$\alpha_i^2 = \beta^2 = I_4$$

and they all mutually anticommute: $\{\alpha_i, \alpha_j\} = 0$ and $\{\alpha_i, \beta\} = 0$. Explicitly,

$$\alpha_i\alpha_j = -\alpha_j\alpha_i, \qquad \alpha_i\beta = -\beta\alpha_i$$

where i and j are distinct and range from 1 to 3. These matrices, and the form of the wavefunction, have a deep 
mathematical significance. The algebraic structure represented by the Dirac matrices had been created some 50 years 
earlier by the English mathematician W. K. Clifford. In turn, Clifford's formulation had been based on the mid-19th
century work of the German mathematician Hermann Grassmann in his "Lineare Ausdehnungslehre" (Theory of 
Linear Extensions). The latter had been regarded as well-nigh incomprehensible by most of his contemporaries. The 
appearance of something so seemingly abstract, at such a late date, in such a direct physical manner, is one of the 
most remarkable chapters in the history of physics. 

The anticommutation rules are designed so that a solution of Dirac's equation will automatically also be a solution of

$$\left((mc^2)^2 + \sum_{k=1}^{3}(p_k c)^2\right)\psi = E^2\psi$$

which is the relativistic energy-momentum equation.



Comparison with the Schrodinger equation 

The Dirac equation is superficially similar to the Schrodinger equation for a free particle:

$$-\frac{\hbar^2}{2m}\nabla^2\phi = i\hbar\frac{\partial}{\partial t}\phi.$$

The left side represents the square of the momentum operator divided by twice the mass, which is the nonrelativistic 
kinetic energy. A relativistic generalization of this equation requires that space and time derivatives must enter 
symmetrically, as they do in the relativistic Maxwell equations — the derivatives must be of the same order in space 
and time. In relativity, the momentum and the energy are the space and time parts of a geometrical space-time 
vector, the 4-momentum, and they are related by the relativistically invariant relation

$$\frac{E^2}{c^2} - p^2 = m^2c^2$$

which says that the length of this vector is proportional to the rest mass m. Replacing E and p by $i\hbar\frac{\partial}{\partial t}$ and $-i\hbar\nabla$, as Schrodinger
theory requires, we get a relativistic equation:

$$\left(\nabla^2 - \frac{1}{c^2}\frac{\partial^2}{\partial t^2}\right)\phi = \frac{m^2c^2}{\hbar^2}\phi$$

and the wave function $\phi$ is a relativistic scalar: a complex number which has the same numerical value in all
frames. Because the equation is second order in the time derivative, one must specify the initial values of not only
$\phi$, but also of $\partial_t\phi$. This is normal for classical waves, where the initial conditions are the position and velocity.
However, in quantum mechanics, the wavefunction is supposed to be the complete description. That is, just knowing 
the wavefunction should determine the future. 

In the Schrodinger theory, the probability density is given by the positive definite expression

$$\rho = \phi^*\phi$$

and its current by

$$\mathbf{J} = -\frac{i\hbar}{2m}\left(\phi^*\nabla\phi - \phi\nabla\phi^*\right)$$

and the conservation of probability density has a local form:

$$\nabla\cdot\mathbf{J} + \frac{\partial\rho}{\partial t} = 0.$$

In a relativistic theory, the probability density and the current must form a four-vector, so the form of the
probability density can be found from the current just by replacing $\nabla$ by $\partial_t$:

$$\rho = \frac{i\hbar}{2m}\left(\phi^*\partial_t\phi - \phi\,\partial_t\phi^*\right).$$

Everything is relativistic now, but the probability density is not positive definite, because the initial values of both $\phi$
and $\partial_t\phi$ can be freely chosen. This expression reduces to Schrodinger's density and current for superpositions of
positive frequency waves whose wavelength is long compared to the Compton wavelength, that is, for nonrelativistic 
motions. It reduces to a negative definite quantity for superpositions of negative frequency waves only. It mixes up 
both signs when forces which have an appreciable amplitude to produce relativistic motions are involved, at which 
point scattering can produce particles and antiparticles. 

Although it was not a successful description of a single particle, this equation is resurrected in quantum field theory, 
where it is known as the Klein-Gordon equation, and describes a relativistic spin-0 complex field. The non-positive 
probability density and current are the charge-density and current, while the particles are described by a 
mode-expansion. 

To interpret the Klein-Gordon equation as an equation for the probability amplitude for a single particle at a given
position, negative frequency solutions must be interpreted as describing the particle travelling backwards in time,
propagating into the past.

The equation with this interpretation does not predict the future from the present except in the nonrelativistic limit;
rather, it places a global constraint on the amplitudes. This can be used to construct a perturbation expansion with
particles zipping backwards and forwards in time, the Feynman diagrams, but it does not allow a straightforward 
wavefunction description, since each particle has its own separate proper time. Ultimately, the entity the 
Klein-Gordon equation (and Dirac equation) acts on is a field, not a wavefunction. The fields are physical fields 
whose values are observables. They are not probability amplitudes. 

Dirac's coup 

Thinking in terms of wavefunctions rather than fields, Dirac reasoned that the necessary equation is first-order in 
both space and time. One could formally take the relativistic expression for the energy $E = c\sqrt{p^2 + m^2c^2}$,
replace p by its operator equivalent, expand the square root in an infinite series of derivative operators, set up an
eigenvalue problem, then solve the equation formally by iterations. Most physicists had little faith in such a process, 
even if it were technically possible. 

As the story goes, Dirac was staring into the fireplace at Cambridge, pondering this problem, when he hit upon the 
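idea of taking the square root of the wave operator thus:

$$\nabla^2 - \frac{1}{c^2}\frac{\partial^2}{\partial t^2} = \left(A\partial_x + B\partial_y + C\partial_z + \frac{i}{c}D\partial_t\right)\left(A\partial_x + B\partial_y + C\partial_z + \frac{i}{c}D\partial_t\right).$$

On multiplying out the right side we see that, to get all the cross-terms such as $\partial_x\partial_y$ to vanish, we must assume

$$AB + BA = 0, \;\ldots$$

with

$$A^2 = B^2 = \ldots = 1.$$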
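
For reference, the product can be written out term by term. This is a worked sketch that assumes the time-derivative term carries the factor $i/c$ and that A, B, C and D are constant, so that they commute with the derivatives but not necessarily with one another:

```latex
\begin{aligned}
\left(A\partial_x + B\partial_y + C\partial_z + \tfrac{i}{c}D\,\partial_t\right)^2
={}& A^2\partial_x^2 + B^2\partial_y^2 + C^2\partial_z^2 - \tfrac{1}{c^2}D^2\partial_t^2 \\
&+ (AB + BA)\,\partial_x\partial_y + (BC + CB)\,\partial_y\partial_z + (CA + AC)\,\partial_z\partial_x \\
&+ \tfrac{i}{c}\left[(AD + DA)\,\partial_x\partial_t + (BD + DB)\,\partial_y\partial_t + (CD + DC)\,\partial_z\partial_t\right].
\end{aligned}
```

The first line reduces to the wave operator when every square equals 1, and the remaining two lines vanish exactly when every pair of distinct symbols anticommutes. Ordinary numbers cannot satisfy both requirements at once, which is why matrices are needed.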

Dirac, who had just then been intensely involved with working out the foundations of Heisenberg's matrix 
mechanics, immediately understood that these conditions could be met if A, B... are matrices, with the implication 
that the wave function has multiple components. This immediately explained the appearance of two-component wave 
functions in Pauli's phenomenological theory of spin, something that up until then had been regarded as mysterious, 
even to Pauli himself. However, one needs at least 4x4 matrices to set up a system with the properties desired — so 
the wave function had four components, not two, as in the Pauli theory. 

Given the factorization in terms of these matrices, one can now write down immediately an equation

$$\left(A\partial_x + B\partial_y + C\partial_z + \frac{i}{c}D\partial_t\right)\psi = \kappa\psi$$

with $\kappa$ to be determined. Applying again the matrix operator on either side yields

$$\left(\nabla^2 - \frac{1}{c^2}\frac{\partial^2}{\partial t^2}\right)\psi = \kappa^2\psi.$$

On taking $\kappa = mc/\hbar$ we find that all the components of the wave function individually satisfy the relativistic
energy-momentum relation. Thus the sought-for equation that is first-order in both space and time is

$$\left(A\partial_x + B\partial_y + C\partial_z + \frac{i}{c}D\partial_t - \frac{mc}{\hbar}\right)\psi = 0.$$

With $(A, B, C) = i\beta\alpha_k$ and $D = \beta$, we get the Dirac equation.



Comparison with the Pauli theory 

The necessity of introducing half-integral spin goes back experimentally to the results of the Stern-Gerlach 
experiment. A beam of atoms is run through a strong inhomogeneous magnetic field, which then splits into N parts 
depending on the intrinsic angular momentum of the atoms. It was found that for silver atoms, the beam was split in 
two. The ground state therefore could not be integral, because even if the intrinsic angular momentum of the atoms
were as small as possible, 1, the beam would be split into 3 parts, corresponding to atoms with $L_z = -1$, 0, and +1.
The conclusion is that silver atoms have net intrinsic angular momentum of 1/2. Pauli set up a theory which explained
this splitting by introducing a two-component wave function and a corresponding correction term in the
Hamiltonian, representing a semi-classical coupling of this wave function to an applied magnetic field, as so:

$$H = \frac{1}{2m}\left(\boldsymbol{\sigma}\cdot\left(\mathbf{p} - \frac{e}{c}\mathbf{A}\right)\right)^2 + eA^0.$$

Here $\mathbf{A}$ and $A^0$ describe the applied electromagnetic field, and the three sigmas are Pauli matrices. $e$ is the charge of the
particle, e.g. $e = -e_0$ for the electron. On squaring out the first term, a residual interaction with the magnetic field
is found, along with the usual Hamiltonian of a charged particle interacting with an applied field:

$$H = \frac{1}{2m}\left(\mathbf{p} - \frac{e}{c}\mathbf{A}\right)^2 + eA^0 - \frac{e\hbar}{2mc}\,\boldsymbol{\sigma}\cdot\mathbf{B}.$$

This Hamiltonian is now a 2 × 2 matrix, so the Schrodinger equation based on it,

$$H\phi = i\hbar\frac{\partial\phi}{\partial t}$$

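must use a two-component wave function. Pauli had introduced the sigma matrices

$$\sigma_1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad \sigma_2 = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \qquad \sigma_3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$$

as pure phenomenology. Dirac now had a theoretical argument that implied that spin was somehow the
consequence of the marriage of quantum theory to relativity.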

The Pauli matrices share the same properties as the Dirac matrices: they are all Hermitian, square to 1, and
anticommute. This allows one to immediately find a representation of the Dirac matrices in terms of the Pauli
matrices:

$$\alpha_k = \begin{pmatrix} 0 & \sigma_k \\ \sigma_k & 0 \end{pmatrix}, \qquad \beta = \begin{pmatrix} I_2 & 0 \\ 0 & -I_2 \end{pmatrix}.$$
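
The algebraic properties claimed for this representation are easy to verify by direct multiplication. The sketch below is illustrative only: it assumes the standard block form built from the Pauli matrices (off-diagonal blocks for the alphas, a diagonal block form for beta), uses plain nested lists rather than a matrix library, and all helper names are hypothetical. It also checks that the square of the free Hamiltonian $c\,\boldsymbol{\alpha}\cdot\mathbf{p} + \beta mc^2$ is $(p^2c^2 + m^2c^4)$ times the identity, with c = 1 assumed by default.

```python
def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def scale(c, a):
    return [[c * x for x in row] for row in a]


def block(tl, tr, bl, br):
    """Assemble a 4x4 matrix from four 2x2 blocks."""
    top = [left + right for left, right in zip(tl, tr)]
    bottom = [left + right for left, right in zip(bl, br)]
    return top + bottom


def close(a, b, tol=1e-12):
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))


I2 = [[1, 0], [0, 1]]
Z2 = [[0, 0], [0, 0]]
SIGMA = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]

ALPHA = [block(Z2, s, s, Z2) for s in SIGMA]   # assumed standard representation
BETA = block(I2, Z2, Z2, scale(-1, I2))
I4 = block(I2, Z2, Z2, I2)
Z4 = scale(0, I4)


def anticommutator(a, b):
    return add(matmul(a, b), matmul(b, a))


def free_hamiltonian(p, m, c=1.0):
    """H = c * alpha.p + beta * m * c^2 for a momentum 3-tuple p."""
    h = scale(m * c * c, BETA)
    for alpha_k, p_k in zip(ALPHA, p):
        h = add(h, scale(c * p_k, alpha_k))
    return h


if __name__ == "__main__":
    print(all(close(matmul(a, a), I4) for a in ALPHA + [BETA]))
    print(all(close(anticommutator(a, BETA), Z4) for a in ALPHA))
    H = free_hamiltonian((0.3, -0.4, 1.2), m=1.0)
    print(close(matmul(H, H), scale(0.3**2 + 0.4**2 + 1.2**2 + 1.0, I4)))
```

Because the cross terms cancel through the anticommutation relations, the square of the Hamiltonian is a multiple of the identity, so every component of a solution obeys the relativistic energy-momentum relation.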



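The Dirac equation now may be written as an equation coupling two-component spinors:

$$\begin{pmatrix} mc^2 & c\,\boldsymbol{\sigma}\cdot\mathbf{p} \\ c\,\boldsymbol{\sigma}\cdot\mathbf{p} & -mc^2 \end{pmatrix}\begin{pmatrix} \phi_+ \\ \phi_- \end{pmatrix} = i\hbar\frac{\partial}{\partial t}\begin{pmatrix} \phi_+ \\ \phi_- \end{pmatrix}.$$

Notice that on the diagonal we find the rest energy of the particle. If we set the momentum to zero, that is, bring the
particle to rest, then we have

$$i\hbar\frac{\partial}{\partial t}\begin{pmatrix} \phi_+ \\ \phi_- \end{pmatrix} = \begin{pmatrix} mc^2 & 0 \\ 0 & -mc^2 \end{pmatrix}\begin{pmatrix} \phi_+ \\ \phi_- \end{pmatrix}.$$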

The equations for the individual two-spinors are now decoupled, and we see that the "top" and "bottom" two-spinors 
are individually eigenfunctions of the energy with eigenvalues equal to plus and minus the rest energy, respectively.
The appearance of this negative energy eigenvalue is completely consistent with relativity. 

It should be strongly emphasized that this separation in the rest frame is not an invariant statement: the "bottom"
two-spinor does not represent antimatter as such in general. The entire four-component spinor represents an
irreducible whole; in general, states will have an admixture of positive and negative energy components. If we
couple the Dirac equation to an electromagnetic field, as in the Pauli theory, then the positive and negative energy
parts will be mixed together, even if they are originally decoupled. Dirac's main problem was to find a consistent
interpretation of this mixing. As we shall see below, it brings a new phenomenon into physics: matter/antimatter
creation and annihilation.



Covariant form and relativistic invariance 

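The explicitly covariant form of the Dirac equation is (employing the Einstein summation convention):

$$-i\hbar\gamma^\mu\partial_\mu\psi + mc\psi = 0.$$

In the above, $\gamma^\mu$ are the Dirac matrices. $\gamma^0$ is Hermitian, and the $\gamma^k$ are anti-Hermitian, with the definition

$$\gamma^0 = \beta, \qquad \gamma^k = \gamma^0\alpha_k.$$

Thus

$$\gamma^0 = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix}, \qquad \gamma^1 = \begin{pmatrix} 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \\ 0 & -1 & 0 & 0 \\ -1 & 0 & 0 & 0 \end{pmatrix},$$

$$\gamma^2 = \begin{pmatrix} 0 & 0 & 0 & -i \\ 0 & 0 & i & 0 \\ 0 & i & 0 & 0 \\ -i & 0 & 0 & 0 \end{pmatrix}, \qquad \gamma^3 = \begin{pmatrix} 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & -1 \\ -1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \end{pmatrix}.$$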

This may be summarized using the Minkowski metric on spacetime in the form

$$\{\gamma^\mu, \gamma^\nu\} = 2\eta^{\mu\nu}$$

where the bracket expression {a, b} means ab + ba, the anticommutator. These are the defining relations of a
Clifford algebra over a pseudo-orthogonal 4-d space with metric signature (+ − − −). (Note that one may also
employ the metric form (− + + +) by multiplying all the gammas by a factor of i.) The specific Clifford algebra
employed in the Dirac equation is known as the Dirac algebra.

The Dirac equation may be interpreted as an eigenvalue expression, where the rest mass is proportional to an
eigenvalue of the 4-momentum operator, the proportion being the speed of light in vacuo:

$$P_{\mathrm{op}}\psi = mc\,\psi, \qquad P_{\mathrm{op}} = i\hbar\gamma^\mu\partial_\mu.$$

In practice, physicists often use units of measure such that $\hbar$ and c are equal to 1, known as "natural" units. The
equation then takes the simple form

$$(-i\gamma^\mu\partial_\mu + m)\psi = 0$$

or, if Feynman slash notation is employed,

$$(-i\not\partial + m)\psi = 0.$$

A fundamental theorem states that if two distinct sets of matrices are given that both satisfy the Clifford relations,
then they are connected to each other by a similarity transformation:

$$\gamma^{\mu\prime} = S^{-1}\gamma^\mu S.$$

If in addition the matrices are all unitary, as are the Dirac set, then S itself is unitary:

$$\gamma^{\mu\prime} = U^\dagger\gamma^\mu U.$$

The transformation U is unique up to a multiplicative factor of absolute value 1. Let us now imagine a Lorentz
transformation to have been performed on the derivative operators, which form a covariant vector. For the operator
$\gamma^\mu\partial_\mu$ to remain invariant, the gammas must transform among themselves as a contravariant vector with respect to
their spacetime index. These new gammas will themselves satisfy the Clifford relations, because of the orthogonality
of the Lorentz transformation. By the fundamental theorem, we may replace the new set by the old set subject to a
unitary transformation. In the new frame, remembering that the rest mass is a relativistic scalar, the Dirac equation
will then take the form

$$(-iU^\dagger\gamma^\mu U\partial_\mu^\prime + m)\psi(x^\prime, t^\prime) = 0$$

$$U^\dagger(-i\gamma^\mu\partial_\mu^\prime + m)U\psi(x^\prime, t^\prime) = 0.$$

If we now define the transformed spinor

$$\psi^\prime = U\psi$$

then we have the transformed Dirac equation

$$(-i\gamma^\mu\partial_\mu^\prime + m)\psi^\prime(x^\prime, t^\prime) = 0.$$

Thus, once we settle on a unitary representation of the gammas, it is final provided we transform the spinor
according to the unitary transformation that corresponds to the given Lorentz transformation.

These considerations reveal the origin of the gammas in geometry, hearkening back to Grassmann's original
motivation: they represent a fixed basis of unit vectors in spacetime. Similarly, products of the gammas such as
$\gamma_\mu\gamma_\nu$ represent oriented surface elements, and so on. With this in mind, we can find the form of the unit volume
element on spacetime in terms of the gammas as follows. By definition, it is

$$V = \frac{1}{4!}\epsilon_{\mu\nu\alpha\beta}\gamma^\mu\gamma^\nu\gamma^\alpha\gamma^\beta.$$

For this to be an invariant, the epsilon symbol must be a tensor, and so must contain a factor of $\sqrt{g}$, where g is the
determinant of the metric tensor. Since this is negative, that factor is imaginary. Thus

$$V = i\gamma^0\gamma^1\gamma^2\gamma^3.$$

This matrix is given the special symbol $\gamma^5$, owing to its importance when one is considering improper
transformations of spacetime, that is, those that change the orientation of the basis vectors. In the representation we
are using for the gammas, it is

$$\gamma^5 = \begin{pmatrix} 0 & I_2 \\ I_2 & 0 \end{pmatrix}.$$

Also note that we could as easily have taken the negative square root of the determinant of g - the choice amounts to 
an initial handedness convention. 
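
The Clifford relations and the block form of the volume element can be checked numerically. The following is a minimal sketch, assuming the Dirac representation of the gammas and the metric signature (+, −, −, −); neither is restated in this passage, so both are assumptions, and the helper names (`matmul`, `kron2`, `clifford_holds`, `gamma5`) are purely illustrative.

```python
# Assumptions: Dirac representation, metric signature (+, -, -, -).
# Matrices are plain nested lists so that no external library is needed.

def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def kron2(a, b):
    """Kronecker product of two 2x2 matrices, returned as a 4x4 matrix."""
    return [[a[i][j] * b[k][l] for j in range(2) for l in range(2)]
            for i in range(2) for k in range(2)]

I2 = [[1, 0], [0, 1]]
SIGMA = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]
EPS = [[0, 1], [-1, 0]]  # equals i * sigma_2; builds the off-diagonal blocks

# gamma^0 = diag(I, -I); gamma^k = [[0, sigma_k], [-sigma_k, 0]]
GAMMA = [kron2(SIGMA[2], I2)] + [kron2(EPS, s) for s in SIGMA]
ETA = [1, -1, -1, -1]
I4 = [[int(i == j) for j in range(4)] for i in range(4)]

def anticommutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] + ba[i][j] for j in range(4)] for i in range(4)]

def clifford_holds():
    """Check {gamma^mu, gamma^nu} = 2 eta^{mu nu} I for every index pair."""
    for mu in range(4):
        for nu in range(4):
            target = 2 * ETA[mu] if mu == nu else 0
            expected = [[target * I4[i][j] for j in range(4)] for i in range(4)]
            if anticommutator(GAMMA[mu], GAMMA[nu]) != expected:
                return False
    return True

def gamma5():
    """gamma^5 = i gamma^0 gamma^1 gamma^2 gamma^3."""
    prod = matmul(matmul(GAMMA[0], GAMMA[1]), matmul(GAMMA[2], GAMMA[3]))
    return [[1j * x for x in row] for row in prod]

print(clifford_holds())
print(gamma5() == kron2(SIGMA[0], I2))  # compare with [[0, I_2], [I_2, 0]]
```

Exact list comparison is adequate here only because every entry is 0, ±1 or ±i; a tolerance would be needed for a general representation.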

Lorentz Invariance of the Dirac equation 

The Lorentz invariance of the Dirac equation follows from its covariant nature.

Comparison with the Klein-Gordon equation 

Using the Feynman slash notation, the Klein-Gordon equation can be factored: 

$$0 = (\partial^2 + m^2)\psi = (\not\partial^2 + m^2)\psi = (i\not\partial + m)(-i\not\partial + m)\psi.$$

The last factor is simply the Dirac operator. Hence any solution to the Dirac equation is automatically a solution to
the Klein-Gordon equation:

$$(-i\not\partial + m)\psi = 0 \;\Rightarrow\; (\partial^2 + m^2)\psi = 0.$$

But the converse is not true; not all solutions to the Klein-Gordon equation solve the Dirac equation.
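
The factorization rests on one algebraic step that is easy to overlook. The following worked lines are a sketch, assuming the Clifford relations in the form $\{\gamma^\mu, \gamma^\nu\} = 2\eta^{\mu\nu}$ and using `\not\partial` for the Feynman slash:

```latex
\begin{align*}
(i\not\partial + m)(-i\not\partial + m)\psi
  &= \left(\not\partial^2 + i m\not\partial - i m\not\partial + m^2\right)\psi
   = \left(\not\partial^2 + m^2\right)\psi, \\
\not\partial^2
  &= \gamma^\mu\gamma^\nu\partial_\mu\partial_\nu
   = \tfrac{1}{2}\{\gamma^\mu, \gamma^\nu\}\,\partial_\mu\partial_\nu
   = \eta^{\mu\nu}\partial_\mu\partial_\nu
   = \partial^2 .
\end{align*}
```

The symmetrization in the second line is allowed because partial derivatives commute, so only the symmetric part of $\gamma^\mu\gamma^\nu$ contributes.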






Adjoint equation and Dirac current 

By defining the adjoint spinor

$$\bar\psi = \psi^\dagger\gamma^0,$$

where $\psi^\dagger$ is the conjugate transpose of $\psi$, and noticing that

$$(\gamma^\mu)^\dagger\gamma^0 = \gamma^0\gamma^\mu,$$

we obtain, by taking the Hermitian conjugate of the Dirac equation and multiplying from the right by $\gamma^0$, the
adjoint equation:

$$\bar\psi(i\gamma^\mu\partial_\mu + m) = 0,$$

where $\partial_\mu$ is understood to act to the left. Multiplying the Dirac equation by $\bar\psi$ from the left, and the adjoint
equation by $\psi$ from the right, and subtracting, produces the law of conservation of the Dirac current in covariant form:

$$\partial_\mu(\bar\psi\gamma^\mu\psi) = 0.$$
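
The conservation law can be followed line by line. This is a sketch under the sign convention used in this section, $(-i\gamma^\mu\partial_\mu + m)\psi = 0$, with the adjoint equation written out so that the derivative visibly acts on $\bar\psi$:

```latex
\begin{align*}
\bar\psi \times \text{(Dirac equation)}: \quad
  & -i\,\bar\psi\gamma^\mu(\partial_\mu\psi) + m\,\bar\psi\psi = 0, \\
\text{(adjoint equation)} \times \psi: \quad
  & \phantom{-}i\,(\partial_\mu\bar\psi)\gamma^\mu\psi + m\,\bar\psi\psi = 0, \\
\text{difference}: \quad
  & -i\left[\bar\psi\gamma^\mu(\partial_\mu\psi) + (\partial_\mu\bar\psi)\gamma^\mu\psi\right]
    = -i\,\partial_\mu\!\left(\bar\psi\gamma^\mu\psi\right) = 0 .
\end{align*}
```

With these signs the mass terms cancel in the difference of the two lines; in their sum the derivative terms would not combine into a total divergence.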

Now we see the great advantage of the first-order equation over the one Schrödinger had tried: this is the conserved
current density required by relativistic invariance, only now its 0-component is positive definite:

$$J^0 = \bar\psi\gamma^0\psi = \psi^\dagger\psi.$$

The Dirac equation and its adjoint are the Euler-Lagrange equations of the 4-d invariant action integral

$$S = \int \mathcal{L}\, d^4x,$$

where $d^4x = dt\,dx\,dy\,dz$, and the scalar $\mathcal{L}$ is the Dirac Lagrangian (written here with c = 1)

$$\mathcal{L} = m\bar\psi\psi - \frac{i\hbar}{2}\left(\bar\psi\gamma^\mu(\partial_\mu\psi) - (\partial_\mu\bar\psi)\gamma^\mu\psi\right),$$

and for the purposes of variation, $\psi$ and $\bar\psi$ are regarded as independent fields. The relativistic invariance also
follows immediately from the variational principle.

Coupling to an electromagnetic field 

To consider problems in which an applied electromagnetic field interacts with the particles described by the Dirac 
equation, one uses the correspondence principle, and takes over into the theory the corresponding expression from 
classical mechanics, whereby the total momentum of a charged particle in an external field is modified as so: 

$$\mathbf{p} \to \mathbf{p} - \frac{q}{c}\mathbf{A}$$

(where q is the charge of the particle; for example, q = −e < 0 for an electron). In natural units, the Dirac
equation then takes the form

$$\left[-i\gamma^\mu(\partial_\mu + iqA_\mu) + m\right]\psi = 0.$$

The validity of this prescription has been confirmed experimentally with great precision. It is known as minimal 
coupling, and is found throughout particle physics. Indeed, while the introduction of the electromagnetic field in this 
way is essentially phenomenological in this context, it arises from a fundamental principle in quantum field theory. 

Now as stated above, the transformation U is defined only up to a phase factor $e^{i\theta}$. Also, the fundamental
observable of the Dirac theory, the current, is unchanged if we multiply the field (which is not a wavefunction) by an 
arbitrary phase. Because the field is not a wavefunction, this phase invariance has a different physical meaning from 
the phase invariance of probability amplitudes. We may exploit this to get the form of the mutual interaction of a 
Dirac particle and the electromagnetic field, as opposed to simply considering a Dirac particle in an applied field, by 
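assuming this arbitrary phase factor to depend continuously on position:

$$\psi \to e^{i\theta(x)}\psi.$$

Notice now that

$$\partial_\mu\left(e^{i\theta(x)}\psi\right) = e^{i\theta(x)}\left(\partial_\mu + i\,\partial_\mu\theta\right)\psi.$$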






To preserve minimal coupling, we must add to the potential a term proportional to the gradient of the phase. But we 
know from electrodynamics that this leaves the electromagnetic field itself invariant. The value of the phase is 
arbitrary, but not how it changes from place to place. This is the starting point of gauge theory, which is the main 
principle on which quantum field theory is based. The simplest such theory, and the one most thoroughly 
understood, is known as quantum electrodynamics. The equations of field theory thus have invariance under both 
Lorentz transformations and gauge transformations. 
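
The compensation between the phase and the potential can be written out explicitly. The lines below are a sketch, assuming the covariant derivative $D_\mu = \partial_\mu + iqA_\mu$ implied by the minimally coupled equation above; the primed symbols and the normalization of the shift in $A_\mu$ are notational choices made here, not taken from the text.

```latex
\begin{align*}
D_\mu &= \partial_\mu + iqA_\mu, \qquad
\psi' = e^{i\theta(x)}\psi, \qquad
A'_\mu = A_\mu - \frac{1}{q}\,\partial_\mu\theta, \\
D'_\mu\psi'
  &= \left(\partial_\mu + iqA_\mu - i\,\partial_\mu\theta\right)e^{i\theta}\psi
   = e^{i\theta}\left(\partial_\mu + i\,\partial_\mu\theta + iqA_\mu - i\,\partial_\mu\theta\right)\psi
   = e^{i\theta}D_\mu\psi, \\
F'_{\mu\nu}
  &= \partial_\mu A'_\nu - \partial_\nu A'_\mu
   = F_{\mu\nu} - \frac{1}{q}\left(\partial_\mu\partial_\nu - \partial_\nu\partial_\mu\right)\theta
   = F_{\mu\nu}.
\end{align*}
```

The second line shows that the coupled Dirac equation keeps its form, and the third that the field strength is untouched, which is the invariance the paragraph describes.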

Curved spacetime Dirac equation 

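The Dirac equation can be written in curved spacetime using vierbein fields. Vierbeins describe a local frame that
makes it possible to define Dirac matrices at every point. Contracting these matrices with vierbeins gives the right
transformation properties. In this way Dirac's equation takes the following form in curved spacetime [1]:

$$-i\gamma^a e_a^\mu D_\mu\Psi + m\Psi = 0.$$

Here $e_a^\mu$ is the vierbein and $D_\mu$ is the covariant derivative for fermion fields, defined as follows

$$D_\mu = \partial_\mu - \frac{i}{4}\,\eta_{ac}\,\omega^c_{\ b\mu}\,\sigma^{ab},$$

where $\eta_{ac}$ is the Lorentzian metric, $\sigma^{ab}$ is the commutator of Dirac matrices:

$$\sigma^{ab} = \frac{i}{2}\left[\gamma^a, \gamma^b\right],$$

and $\omega^c_{\ b\mu}$ is the spin connection:

$$\omega^c_{\ b\mu} = e^c_\nu\,\partial_\mu e^\nu_b + e^c_\nu\, e^\sigma_b\,\Gamma^\nu_{\sigma\mu},$$

where $\Gamma^\nu_{\sigma\mu}$ is the Christoffel symbol. Note that here, Latin letters denote the "Lorentzian" indices and Greek ones
denote "Riemannian" indices.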

Physical interpretation 

The Dirac theory, while providing a wealth of information that is accurately confirmed by experiments, nevertheless 
introduces a new physical paradigm that appears at first difficult to interpret and even paradoxical. Some of these 
issues of interpretation must be regarded as open questions. Here we will see how the Dirac theory brilliantly 
answered some of the outstanding issues in physics at the time it was put forward, while posing others that are still 
the subject of debate. 

Identification of observables 

The critical physical question in a quantum theory is - what are the physically observable quantities defined by the 
theory? According to general principles, such quantities are defined by Hermitian operators that act on the Hilbert 
space of possible states of a system. The eigenvalues of these operators are then the possible results of measuring the 
corresponding physical quantity. In the Schrodinger theory, the simplest such object is the overall Hamiltonian, 
which represents the total energy of the system. If we wish to maintain this interpretation on passing to the Dirac 
theory, we must take the Hamiltonian to be 



$$H = \gamma^0\left[mc^2 + c\sum_{k=1}^{3}\gamma^k\left(p_k - \frac{q}{c}A_k\right)\right] + qA^0.$$

This looks promising, because we see by inspection the rest energy of the particle and, in the case A = 0, the energy
of a charge placed in an electric potential, $qA^0$. What about the term involving the vector potential? In classical
electrodynamics, the energy of a charge moving in an applied potential is

$$H = c\sqrt{\left(\mathbf{p} - \frac{q}{c}\mathbf{A}\right)^2 + m^2c^2} + qA^0.$$

Thus the Dirac Hamiltonian is fundamentally distinguished from its classical counterpart, and we must take great
care to correctly identify what is an observable in this theory. Much of the apparently paradoxical behavior implied by
the Dirac equation amounts to a misidentification of these observables.
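
Although the Dirac Hamiltonian is linear in the momentum, its square reproduces the classical energy-momentum relation. The sketch below checks this for the free case, assuming natural units, vanishing potentials and the Dirac representation of the gammas; the numerical values of `p` and `m` and all helper names are illustrative only.

```python
import math

# Assumptions: natural units (hbar = c = 1), A = 0, Dirac representation.

def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def kron2(a, b):
    """Kronecker product of two 2x2 matrices, returned as a 4x4 matrix."""
    return [[a[i][j] * b[k][l] for j in range(2) for l in range(2)]
            for i in range(2) for k in range(2)]

I2 = [[1, 0], [0, 1]]
S1 = [[0, 1], [1, 0]]
S2 = [[0, -1j], [1j, 0]]
S3 = [[1, 0], [0, -1]]
EPS = [[0, 1], [-1, 0]]  # equals i * sigma_2

GAMMA0 = kron2(S3, I2)
GAMMAS = [kron2(EPS, s) for s in (S1, S2, S3)]

def free_hamiltonian(p, m):
    """H = gamma^0 (m + sum_k gamma^k p_k) for a free particle."""
    inner = [[m * (i == j) + sum(g[i][j] * pk for g, pk in zip(GAMMAS, p))
              for j in range(4)] for i in range(4)]
    return matmul(GAMMA0, inner)

p, m = (1, 2, 2), 4          # illustrative values: |p|^2 = 9, m^2 = 16
H = free_hamiltonian(p, m)
H2 = matmul(H, H)            # should be (|p|^2 + m^2) times the identity
energy = math.sqrt(sum(pk * pk for pk in p) + m * m)
trace = sum(H[i][i] for i in range(4))

print(energy)
print([H2[i][i] for i in range(4)], trace)
```

Because H is Hermitian, squares to a multiple of the identity and is traceless, its eigenvalues come in pairs of opposite sign with magnitude `energy`: two positive and two negative, which is the doubling of solutions taken up in the hole theory section.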

History 

Since the Dirac equation was originally invented to describe the electron, we will generally speak of "electrons" in 
this article. The equation also applies to quarks, which are also elementary spin-½ particles. A modified Dirac
equation can be used to approximately describe protons and neutrons, which are not elementary particles (they are
made up of quarks), but have a net spin of ½. Another modification of the Dirac equation, called the Majorana
equation, is thought to describe neutrinos, which are also spin-½ particles.

The Dirac equation describes the probability amplitudes for a single electron. This is a single-particle theory; in other 
words, it does not account for the creation and destruction of the particles, and for the ultimate need to switch from 
the Dirac equation for wavefunctions to the physically distinct Dirac equation for fields. It gives a good prediction of 
the magnetic moment of the electron and explains much of the fine structure observed in atomic spectral lines. It also 
explains the spin of the electron. Two of the four solutions of the equation correspond to the two spin states of the 
electron. The other two solutions make the peculiar prediction that there exists an infinite set of quantum states in
which the electron possesses negative energy. This strange result led Dirac to predict, via a remarkable hypothesis 
known as "hole theory," the existence of particles behaving like positively-charged electrons. Dirac thought at first 
these particles might be protons. He was chagrined when the strict prediction of his equation (which actually 
specifies particles of the same mass as the electron) was verified by the discovery of the positron in 1932. When 
asked later why he hadn't actually boldly predicted the yet unfound positron with its correct mass, Dirac answered 
"Pure cowardice!" He shared the Nobel Prize anyway, in 1933. 

A similar equation for spin 3/2 particles is called the Rarita-Schwinger equation.

Hole theory 

The negative E solutions found in the preceding section are problematic, for it was assumed that the particle has a 
positive energy. Mathematically speaking, however, there seems to be no reason for us to reject the negative-energy 
solutions. Since they exist, we cannot simply ignore them, for once we include the interaction between the electron 
and the electromagnetic field, any electron placed in a positive-energy eigenstate would decay into negative-energy 
eigenstates of successively lower energy by emitting excess energy in the form of photons. Real electrons obviously 
do not behave in this way. 

To cope with this problem, Dirac introduced the hypothesis, known as hole theory, that the vacuum is the 
many-body quantum state in which all the negative-energy electron eigenstates are occupied. This description of the 
vacuum as a "sea" of electrons is called the Dirac sea. Since the Pauli exclusion principle forbids electrons from 
occupying the same state, any additional electron would be forced to occupy a positive-energy eigenstate, and 
positive-energy electrons would be forbidden from decaying into negative-energy eigenstates. 

Dirac further reasoned that if the negative-energy eigenstates are incompletely filled, each unoccupied eigenstate - 
called a hole - would behave like a positively charged particle. The hole possesses a positive energy, since energy is 
required to create a particle-hole pair from the vacuum. As noted above, Dirac initially thought that the hole might 
be the proton, but Hermann Weyl pointed out that the hole should behave as if it had the same mass as an electron, 
whereas the proton is over 1800 times heavier. The hole was eventually identified as the positron, experimentally 
discovered by Carl Anderson in 1932. 






It is not entirely satisfactory to describe the "vacuum" using an infinite sea of negative-energy electrons. The 
infinitely negative contributions from the sea of negative-energy electrons have to be canceled by an infinite positive
"bare" energy and the contribution to the charge density and current coming from the sea of negative-energy 
electrons is exactly canceled by an infinite positive "jellium" background so that the net electric charge density of the 
vacuum is zero. In quantum field theory, a Bogoliubov transformation on the creation and annihilation operators 
(turning an occupied negative-energy electron state into an unoccupied positive energy positron state and an 
unoccupied negative-energy electron state into an occupied positive energy positron state) allows us to bypass the 
Dirac sea formalism even though, formally, it is equivalent to it. 

In certain applications of condensed matter physics, however, the underlying concepts of "hole theory" are valid. The 
sea of conduction electrons in an electrical conductor, called a Fermi sea, contains electrons with energies up to the 
chemical potential of the system. An unfilled state in the Fermi sea behaves like a positively-charged electron, 
though it is referred to as a "hole" rather than a "positron". The negative charge of the Fermi sea is balanced by the 
positively-charged ionic lattice of the material. 

Dirac bilinears 

There are five different (neutral) Dirac bilinear terms not involving any derivatives: 

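• (S)calar: $\bar\psi\psi$ (scalar, P-even)

• (P)seudoscalar: $\bar\psi\gamma^5\psi$ (scalar, P-odd)

• (V)ector: $\bar\psi\gamma^\mu\psi$ (vector, P-odd)

• (A)xial: $\bar\psi\gamma^\mu\gamma^5\psi$ (vector, P-even)

• (T)ensor: $\bar\psi\sigma^{\mu\nu}\psi$ (antisymmetric tensor, P-even),

where $\sigma^{\mu\nu} = \frac{i}{2}[\gamma^\mu, \gamma^\nu]$ and $\gamma^5 = \gamma_5 = \frac{i}{4!}\epsilon_{\mu\nu\rho\lambda}\gamma^\mu\gamma^\nu\gamma^\rho\gamma^\lambda = i\gamma^0\gamma^1\gamma^2\gamma^3$.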

A Dirac mass term is an S coupling. A Yukawa coupling may be S or P. The electromagnetic coupling is V. The 
weak interactions are V-A. 
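
The parity labels can be made concrete. The sketch below assumes the usual parity action on spinors, ψ → γ⁰ψ (up to a phase), so that a bilinear built on a matrix Γ turns into one built on γ⁰Γγ⁰; it also assumes the Dirac representation. For the vector and axial cases, the P-odd and P-even labels refer to the spatial components. All names are illustrative.

```python
# Assumptions: Dirac representation; parity acts on spinors as psi -> gamma^0 psi.

def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def kron2(a, b):
    """Kronecker product of two 2x2 matrices, returned as a 4x4 matrix."""
    return [[a[i][j] * b[k][l] for j in range(2) for l in range(2)]
            for i in range(2) for k in range(2)]

I2 = [[1, 0], [0, 1]]
SIGMA = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]
EPS = [[0, 1], [-1, 0]]  # equals i * sigma_2
GAMMA = [kron2(SIGMA[2], I2)] + [kron2(EPS, s) for s in SIGMA]
I4 = [[int(i == j) for j in range(4)] for i in range(4)]

prod = matmul(matmul(GAMMA[0], GAMMA[1]), matmul(GAMMA[2], GAMMA[3]))
G5 = [[1j * x for x in row] for row in prod]  # gamma^5 = i g0 g1 g2 g3

def sigma_tensor(mu, nu):
    """sigma^{mu nu} = (i/2) [gamma^mu, gamma^nu]."""
    ab, ba = matmul(GAMMA[mu], GAMMA[nu]), matmul(GAMMA[nu], GAMMA[mu])
    return [[0.5j * (ab[i][j] - ba[i][j]) for j in range(4)] for i in range(4)]

def parity_sign(G):
    """Return s in {+1, -1} such that gamma^0 G gamma^0 == s * G, else None."""
    t = matmul(matmul(GAMMA[0], G), GAMMA[0])
    for s in (1, -1):
        if t == [[s * x for x in row] for row in G]:
            return s
    return None

signs = {
    "S": parity_sign(I4),
    "P": parity_sign(G5),
    "V time": parity_sign(GAMMA[0]),
    "V space": parity_sign(GAMMA[1]),
    "A time": parity_sign(matmul(GAMMA[0], G5)),
    "A space": parity_sign(matmul(GAMMA[1], G5)),
    "T time-space": parity_sign(sigma_tensor(0, 1)),
    "T space-space": parity_sign(sigma_tensor(1, 2)),
}
print(signs)
```

The spatial vector components flip sign while the spatial axial components do not, which is the sense in which V is P-odd and A is P-even in the list above.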

See also 

• Bohr-Sommerfeld theory 

• Breit equation 

• Dirac field 

• Einstein-Maxwell-Dirac equations 

• Feynman checkerboard 

• Foldy-Wouthuysen transformation 

• Klein-Gordon equation 

• Quantum electrodynamics 

• Rarita-Schwinger equation 

• Theoretical and experimental justification for the Schrodinger equation 

• The Dirac Equation appears on the floor of Westminster Abbey. It appears on the plaque commemorating Paul
Dirac's life which was inaugurated on November 13, 1995.






References 

[1] Lawrie, Ian D.. A Unified Grand Tour of Theoretical Physics. 
[2] http://www.dirac.ch/PaulDirac.html

Selected papers 

• P.A.M. Dirac "The Quantum Theory of the Electron", Proc. R. Soc. A (1928) vol. 117, no 778, 610-624
(http://dx.doi.org/10.1098/rspa.1928.0023)

• P.A.M. Dirac "The Quantum Theory of the Electron", Proc. R. Soc. A117 (http://gallica.bnf.fr/ark:/12148/bpt6k562109)
link to the volume of the Proceedings of the Royal Society of London containing the article at page 610

• P.A.M. Dirac "A Theory of Electrons and Protons", Proc. R. Soc. A126 (http://gallica.bnf.fr/ark:/12148/bpt6k56219d/f388.table)
link to the volume of the Proceedings of the Royal Society of London containing the article at page 360

• C.D. Anderson, Phys. Rev. 43, 491 (1933)

• R. Frisch and O. Stern, Z. Phys. 85, 4 (1933) 

• R. Chen, New Exact Solution of Dirac-Coulomb Equation with Exact Boundary Condition. International Journal 
of Theoretical Physics 47, 881 (2008). 

Textbooks 

• Halzen, Francis; Martin, Alan (1984). Quarks & Leptons: An Introductory Course in Modern Particle Physics. 
John Wiley & Sons. ISBN. 

• Dirac, P.A.M., Principles of Quantum Mechanics, 4th edition (Clarendon, 1982) 

• Shankar, R., Principles of Quantum Mechanics, 2nd edition (Plenum, 1994) 

• Bjorken, J D & Drell, S, Relativistic Quantum mechanics 

• Thaller, B., The Dirac Equation, Texts and Monographs in Physics (Springer, 1992) 

• Schiff, L.I., Quantum Mechanics, 3rd edition (McGraw-Hill, 1968) 

• Griffiths, D.J., Introduction to Elementary Particles, 2nd edition (Wiley-VCH, 2008) ISBN 978-3-527-40601-2. 

External links 

• The Dirac Equation (http://www.mathpages.com/home/kmath654/kmath654.htm) at MathPages 

• The Nature of the Dirac Equation, its solutions and Spin (http://www.mc.maricopa.edu/~kevinlg/i256/Nature_Dirac.pdf)

• Dirac equation for a spin ½ particle (http://electron6.phys.utk.edu/qm2/modules/m9/dirac.htm)

• Unitary Principle and Mechanical Non-solution Proving of Neomorphic Dirac Equation
(http://commons.wikimedia.org/wiki/File:Unitary_Principle_and_Mechanical_Non-solution_Proving_of_Neomorphic_Dirac_Equation.pdf)

• Pedagogic Aids to Quantum Field Theory (http://www.quantumfieldtheory.info) click on Chap. 4 for a 
step-by-small-step introduction to the Dirac equation, spinors, and relativistic spin/helicity operators. 






Klein-Gordon equation 

The Klein-Gordon equation (Klein-Fock-Gordon equation or sometimes Klein-Gordon-Fock equation) is a 
relativistic version of the Schrodinger equation. 

It is the equation of motion of a quantum scalar or pseudoscalar field, a field whose quanta are spinless particles. It 
cannot be straightforwardly interpreted as a Schrodinger equation for a quantum state, because it is second order in 
time and because it does not admit a positive definite conserved probability density. Still, with the appropriate 
interpretation, it does describe the quantum amplitude for finding a point particle in various places, the relativistic 
wavefunction, but the particle propagates both forwards and backwards in time. Any solution to the Dirac equation is 
automatically a solution to the Klein-Gordon equation, but the converse is not true. 



Statement 

The Klein-Gordon equation is

$$\frac{1}{c^2}\frac{\partial^2}{\partial t^2}\psi - \nabla^2\psi + \frac{m^2c^2}{\hbar^2}\psi = 0.$$

It is most often written in natural units:

$$-\partial_t^2\psi + \nabla^2\psi = m^2\psi.$$

The form is determined by requiring that plane wave solutions of the equation,

$$\psi = e^{-i\omega t + i\mathbf{k}\cdot\mathbf{x}} = e^{ik_\mu x^\mu},$$

obey the energy-momentum relation of special relativity:

$$-p_\mu p^\mu = E^2 - P^2 = \omega^2 - k^2 = -k_\mu k^\mu = m^2.$$

Unlike the Schrödinger equation, there are two values of ω for each k, one positive and one negative. Only by
separating out the positive and negative frequency parts does the equation describe a relativistic wavefunction. For
the time-independent case, the Klein-Gordon equation becomes

$$\left[\nabla^2 - \frac{m^2c^2}{\hbar^2}\right]\psi(\mathbf{r}) = 0,$$



which is the homogeneous screened Poisson equation. 
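
The role of the dispersion relation can be illustrated numerically. The sketch below works in 1+1 dimensions and natural units, both simplifying assumptions, and uses central finite differences to evaluate the residual of the equation on a plane wave; the function names, sample point and step size are illustrative.

```python
import cmath
import math

# Assumptions: one space dimension, natural units (hbar = c = 1).

def plane_wave(k, omega):
    """Return psi(x, t) = exp(i (k x - omega t))."""
    return lambda x, t: cmath.exp(1j * (k * x - omega * t))

def kg_residual(psi, x, t, m, h=1e-3):
    """Finite-difference value of -psi_tt + psi_xx - m^2 psi at (x, t)."""
    d2t = (psi(x, t + h) - 2 * psi(x, t) + psi(x, t - h)) / h**2
    d2x = (psi(x + h, t) - 2 * psi(x, t) + psi(x - h, t)) / h**2
    return -d2t + d2x - m**2 * psi(x, t)

k, m = 3.0, 4.0
omega = math.sqrt(k**2 + m**2)   # on-shell frequency; -omega works as well

res_pos = abs(kg_residual(plane_wave(k, omega), 0.3, 0.2, m))
res_neg = abs(kg_residual(plane_wave(k, -omega), 0.3, 0.2, m))
res_off = abs(kg_residual(plane_wave(k, 4.0), 0.3, 0.2, m))  # off-shell frequency

print(res_pos, res_neg, res_off)
```

Both signs of ω leave only a small discretization error, while a frequency that violates ω² = k² + m² gives a residual of order one, reflecting the two branches mentioned in the text.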



History 

The equation was named after the physicists Oskar Klein and Walter Gordon, who in 1927 proposed that it describes 
relativistic electrons. Although it turned out that the Dirac equation describes the spinning electron, the
Klein-Gordon equation correctly describes the spinless pion. The pion is a composite particle; no spinless elementary
particles have yet been found, although the Higgs boson is theorized to exist as a spin-zero boson, according to the 
Standard Model. 

The Klein-Gordon equation was first considered as a quantum wave equation by Schrodinger in his search for an 
equation describing de Broglie waves. The equation is found in his notebooks from late 1925, and he appears to have 
prepared a manuscript applying it to the hydrogen atom. Yet, without taking into account the electron's spin, the 
Klein-Gordon equation predicts the hydrogen atom's fine structure incorrectly, including overestimating the overall 
magnitude of the splitting pattern by a factor of 4n/(2n − 1) for the n-th energy level. In January 1926,
Schrodinger submitted for publication instead his equation, a non-relativistic approximation that predicts the Bohr 
energy levels of hydrogen without fine structure. 

In 1926, soon after the Schrodinger equation was introduced, Vladimir Fock wrote an article about its generalization 
for the case of magnetic fields, where forces were dependent on velocity, and independently derived this equation. 






Both Klein and Fock used Kaluza and Klein's method. Fock also determined the gauge theory for the wave equation. 
The Klein-Gordon equation for a free particle has a simple plane wave solution. 

Derivation 

The non-relativistic equation for the energy of a free particle is

$$\frac{\mathbf{p}^2}{2m} = E.$$

By quantizing this, we get the non-relativistic Schrödinger equation for a free particle,

$$\frac{\mathbf{p}^2}{2m}\psi = i\hbar\frac{\partial}{\partial t}\psi,$$

where

$$\mathbf{p} = -i\hbar\nabla$$

is the momentum operator ($\nabla$ being the del operator).

The Schrödinger equation suffers from not being relativistically covariant, meaning it does not take into account
Einstein's special relativity.

It is natural to try to use the identity from special relativity

$$\sqrt{\mathbf{p}^2c^2 + m^2c^4} = E$$

for the energy; then, just inserting the quantum mechanical momentum operator yields the equation

$$\sqrt{(-i\hbar\nabla)^2c^2 + m^2c^4}\;\psi = i\hbar\frac{\partial}{\partial t}\psi.$$

This, however, is a cumbersome expression to work with because the differential operator cannot be evaluated while
under the square root sign. In addition, this equation, as it stands, is nonlocal.

Klein and Gordon instead began with the square of the above identity, i.e.

$$\mathbf{p}^2c^2 + m^2c^4 = E^2,$$

which, when quantized, gives

$$\left((-i\hbar\nabla)^2c^2 + m^2c^4\right)\psi = \left(i\hbar\frac{\partial}{\partial t}\right)^2\psi,$$

which simplifies to

$$-\hbar^2c^2\nabla^2\psi + m^2c^4\psi = -\hbar^2\frac{\partial^2}{\partial t^2}\psi.$$

Rearranging terms yields

$$\frac{1}{c^2}\frac{\partial^2}{\partial t^2}\psi - \nabla^2\psi + \frac{m^2c^2}{\hbar^2}\psi = 0.$$

Since all reference to imaginary numbers has been eliminated from this equation, it can be applied to fields that are
real valued as well as those that have complex values.

Using the reciprocal of the Minkowski metric diag(−c², 1, 1, 1), we get

$$-\eta^{\mu\nu}\partial_\mu\partial_\nu\psi + \frac{m^2c^2}{\hbar^2}\psi = 0$$

in covariant notation. This is often abbreviated as

$$(\Box + \mu^2)\psi = 0,$$

where

$$\mu = \frac{mc}{\hbar}$$






and

$$\Box = \frac{1}{c^2}\frac{\partial^2}{\partial t^2} - \nabla^2.$$

This operator is called the d'Alembert operator. Today this form is interpreted as the relativistic field equation for a 
scalar (i.e. spin-0) particle. Furthermore, any solution to the Dirac equation (for a spin-one-half particle) is 
automatically a solution to the Klein-Gordon equation, though not all solutions of the Klein-Gordon equation are 
solutions of the Dirac equation. It is noteworthy that the Klein-Gordon equation is very similar to the Proca 
equation. 

Relativistic free particle solution 

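The Klein-Gordon equation for a free particle can be written as

$$\nabla^2\psi - \frac{1}{c^2}\frac{\partial^2}{\partial t^2}\psi = \frac{m^2c^2}{\hbar^2}\psi$$

with the same solution as in the non-relativistic case:

$$\psi(\mathbf{r}, t) = e^{i(\mathbf{k}\cdot\mathbf{r} - \omega t)}$$

except with the constraint

$$-k^2 + \frac{\omega^2}{c^2} = \frac{m^2c^2}{\hbar^2}.$$

Just as with the non-relativistic particle, we have for energy and momentum:

$$\langle\mathbf{p}\rangle = \langle\psi|\,{-i\hbar\nabla}\,|\psi\rangle = \hbar\mathbf{k},$$
$$\langle E\rangle = \langle\psi|\,i\hbar\frac{\partial}{\partial t}\,|\psi\rangle = \hbar\omega.$$

Except that now when we solve for k and ω and substitute into the constraint equation, we recover the relationship
between energy and momentum for relativistic massive particles:

$$\langle E\rangle^2 = m^2c^4 + \langle\mathbf{p}\rangle^2c^2.$$

For massless particles, we may set m = 0 in the above equations. We then recover the relationship between energy
and momentum for massless particles:

$$\langle E\rangle = \langle|\mathbf{p}|\rangle c.$$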
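
The energy-momentum relation and its two limits can be evaluated directly. This is a small sketch; the function name `relativistic_energy` and the sample values are illustrative, and the default `c=1.0` simply selects natural units.

```python
import math

def relativistic_energy(p, m, c=1.0):
    """E = sqrt(m^2 c^4 + p^2 c^2) for momentum magnitude p and rest mass m."""
    # math.hypot evaluates sqrt(a^2 + b^2) without overflow in the squares.
    return math.hypot(m * c**2, p * c)

e_massive = relativistic_energy(3.0, 4.0)          # natural units
e_massless = relativistic_energy(2.0, 0.0, c=3.0)  # m = 0 reduces to E = p c
e_rest = relativistic_energy(0.0, 2.0, c=3.0)      # p = 0 reduces to E = m c^2

print(e_massive, e_massless, e_rest)
```

The same function can be used with SI values, for example `relativistic_energy(p, 9.109e-31, c=299792458.0)` for an electron of momentum `p` in kg·m/s.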
Action 

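The Klein-Gordon equation can also be derived from the following action

$$S = \int\left(-\frac{\hbar^2}{m}\eta^{\mu\nu}\partial_\mu\bar\psi\,\partial_\nu\psi - mc^2\,\bar\psi\psi\right)d^4x,$$

where $\psi$ is the Klein-Gordon field and m is its mass. The complex conjugate of $\psi$ is written $\bar\psi$. If the scalar
field is taken to be real-valued, then $\bar\psi = \psi$.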

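From this we can derive the stress-energy tensor of the scalar field. It is

$$T^{\mu\nu} = \frac{\hbar^2}{m}\left(\eta^{\mu\alpha}\eta^{\nu\beta} + \eta^{\mu\beta}\eta^{\nu\alpha} - \eta^{\mu\nu}\eta^{\alpha\beta}\right)\partial_\alpha\bar\psi\,\partial_\beta\psi - \eta^{\mu\nu}mc^2\,\bar\psi\psi.$$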






Electromagnetic interaction 

There is a simple way to make any field interact with electromagnetism in a gauge invariant way: replace the 
derivative operators with the gauge covariant derivative operators. The Klein-Gordon equation becomes:

$$D_\mu D^\mu\phi = -(\partial_t - ieA_0)^2\phi + (\partial_i - ieA_i)^2\phi = m^2\phi$$

in natural units, where A is the vector potential. While it is possible to add many higher order terms, for example,

$$D_\mu D^\mu\phi + AF^{\mu\nu}D_\mu\phi\,D_\nu(D_\alpha D^\alpha\phi) = 0,$$

these terms are not renormalizable in 3+1 dimensions.

The field equation for a charged scalar field multiplies by i, which means the field must be complex. In order for a
field to be charged, it must have two components that can rotate into each other, the real and imaginary parts.

The action for a charged scalar is the covariant version of the uncharged action:

$$S = \int_x (\partial_\mu\phi^* + ieA_\mu\phi^*)(\partial_\nu\phi - ieA_\nu\phi)\,\eta^{\mu\nu} = \int_x |D\phi|^2.$$

Gravitational interaction 

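In general relativity, we include the effect of gravity and the Klein-Gordon equation becomes

$$\frac{-1}{\sqrt{-g}}\,\partial_\mu\left(g^{\mu\nu}\sqrt{-g}\,\partial_\nu\psi\right) + \frac{m^2c^2}{\hbar^2}\psi = 0$$

or equivalently

$$0 = -g^{\mu\nu}\nabla_\mu\nabla_\nu\psi + \frac{m^2c^2}{\hbar^2}\psi = -g^{\mu\nu}\partial_\mu\partial_\nu\psi + g^{\mu\nu}\Gamma^\sigma_{\mu\nu}\partial_\sigma\psi + \frac{m^2c^2}{\hbar^2}\psi,$$

where $g^{\alpha\beta}$ is the reciprocal of the metric tensor that is the gravitational potential field, g is the determinant of the
metric tensor, $\nabla_\mu$ is the covariant derivative and $\Gamma^\sigma_{\mu\nu}$ is the Christoffel symbol that is the gravitational force field.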
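
The equivalence of the two forms follows from a standard identity for the contracted Christoffel symbol. The lines below are a sketch of that step, using only the product rule and the identity quoted in the second line:

```latex
\begin{align*}
\frac{1}{\sqrt{-g}}\,\partial_\mu\!\left(\sqrt{-g}\,g^{\mu\nu}\partial_\nu\psi\right)
  &= g^{\mu\nu}\partial_\mu\partial_\nu\psi
   + \frac{1}{\sqrt{-g}}\,\partial_\mu\!\left(\sqrt{-g}\,g^{\mu\nu}\right)\partial_\nu\psi, \\
\frac{1}{\sqrt{-g}}\,\partial_\mu\!\left(\sqrt{-g}\,g^{\mu\nu}\right)
  &= -\,g^{\alpha\beta}\,\Gamma^{\nu}_{\alpha\beta}, \\
\Rightarrow\quad
-\frac{1}{\sqrt{-g}}\,\partial_\mu\!\left(\sqrt{-g}\,g^{\mu\nu}\partial_\nu\psi\right)
  &= -\,g^{\mu\nu}\partial_\mu\partial_\nu\psi
   + g^{\alpha\beta}\,\Gamma^{\sigma}_{\alpha\beta}\,\partial_\sigma\psi .
\end{align*}
```

Adding the mass term to both sides reproduces the two expressions given above.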

References 

• Sakurai, J. J. (1967). Advanced Quantum Mechanics. Addison Wesley. ISBN 0-201-06710-2. 

• Davydov, A.S. (1976). Quantum Mechanics, 2nd Edition. Pergamon. ISBN 0-08-020437-6. 



External links 

• Weisstein, Eric W., "Klein-Gordon equation" from MathWorld.

• Linear Klein-Gordon Equation at EqWorld: The World of Mathematical Equations.

• Nonlinear Klein-Gordon Equation at EqWorld: The World of Mathematical Equations.



References

[1] http://mathworld.wolfram.com/Klein-GordonEquation.html
[2] http://eqworld.ipmnet.ru/en/solutions/lpde/lpde203.pdf
[3] http://eqworld.ipmnet.ru/en/solutions/npde/npde2107.pdf






Einstein-Maxwell-Dirac equations 

Einstein-Maxwell-Dirac equations (EMD) are related to quantum field theory. The current Big Bang Model is a 
quantum field theory in a curved spacetime. Unfortunately, no such theory is mathematically well-defined; in spite 
of this, theoreticians claim to extract information from this hypothetical theory. On the other hand, the 
super-classical limit of the not mathematically well-defined QED in a curved spacetime is the mathematically 
well-defined Einstein-Maxwell-Dirac system. (One could get a similar system for the standard model.) As a super 
theory, EMD violates the positivity condition in the Penrose-Hawking Singularity Theorem. Thus, it is possible that 
there would be complete solutions without any singularities; Yau has in fact constructed some. Furthermore, it is
known that the Einstein-Maxwell-Dirac system admits of solitonic solutions, i.e., classical electrons and photons. 
This is the kind of theory Einstein was hoping for. EMD is also a totally geometricized theory as a non-commutative 
geometry; here, the charge e and the mass m of the electron are geometric invariants of the non-commutative 
geometry analogous to pi. 

One way of trying to construct a rigorous QED and beyond is to attempt to apply the deformation quantization 
program to MD, and more generally, EMD. This would involve the following. 

Program for SCESM 

The Super-Classical Einstein-Standard Model: 

• 1. Extend Asymptotic Completeness, Global Existence and the Infrared Problem for the Maxwell-Dirac Equations 
to SCESM (Memoirs of the American Mathematical Society), by M. Flato, Jacques C. H. Simon, Erik Taflin 
(http://www.amazon.com/Asymptotic-Completeness-Existence-Maxwell-Dirac-Mathematical/dp/ 
0821806831/ref=sr_l_l?ie=UTF8&s=books&qid=1240926988&sr=l-l). 

• 2. Show that the positivity condition in the Penrose-Hawking singularity theorem is violated for the SCESM. 
Construct smooth solutions to SCESM having Dark Stars. See here: The Large Scale Structure of Space-Time by 
Stephen W. Hawking, G. F. R. Ellis (http://www.amazon.com/ 

Structure-Space-Time-Cambridge-Monographs-Mathematical/dp/0521099064/ref=pd_bbs_sr_lie=UTF8& 
s=books&qid=1240927769&sr=8-l) 

• 3. Follow three substeps 

• i. Derive approximate history of the universe from SCESM - both analytically and via computer simulation. 

• ii. Compare with ESM (the QSM in a curved space-time). 

• iii. Compare with observation. See: Cosmology by Steven Weinberg (http://www.amazon.com/ 
Cosmology-StevenWeinberg/dp/O^ 

• 4. Show that the solution space to SCESM, F, is a reasonable infinite dimensional super-symplectic manifold. See:
Supersymmetry for Mathematicians: An Introduction (http://www.amazon.com/
Supersymmetry-Mathematicians-Introduction-Courant-Lecture/dp/0821835742/ref=sr_l_5?ie=UTF8&
s=books&qid=1240894011&sr=1-5)

• 5. Apply deformation quantization to F to obtain mathematically rigorous definition of SQESM (quantum version 
of SCESM). See: 

Deformation Theory and Symplectic Geometry by Daniel Sternheimer, John Rawnsley (http://www.amazon.com/
Deformation-Symplectic-Geometry-Mathematical-Physics/dp/0792345258/ref=sr_l_l?ie=UTF8&s=books&
qid=1240930131&sr=l-l)

• 6. Derive history of the universe from SQESM and compare with observation. 






References 

• http://arxiv.org/PS_cache/gr-qc/pdf/9801/9801079v3.pdf 

• http://arxiv.org/PS_cache/gr-qc/pdf/9810/9810048v4.pdf

• http://arxiv.org/PS_cache/gr-qc/pdf/0005/0005028v3.pdf

• http://deepblue.lib.umich.edu/bitstream/2027.42/49217/2/cqg6_13_009.pdf

• http://arxiv.org/PS_cache/gr-qc/pdf/9910/9910047v2.pdf

• http://arxiv.org/PS_cache/hep-th/pdf/0608/0608221v2.pdf

• http://arxiv.org/PS_cache/hep-th/pdf/0608/0608226v2.pdf

• Varadarajan, V. S. (2004). Supersymmetry for Mathematicians: An Introduction. Courant Lecture Notes in
Mathematics 11. American Mathematical Society. ISBN 0-8218-3574-2.

• Deligne, Pierre (1999). Quantum Fields and Strings: A Course for Mathematicians. 1. American Mathematical 
Society. ISBN 0-8218-2012-5. 

• Deligne, Pierre (1999). Quantum Fields and Strings: A Course for Mathematicians. 2. American Mathematical 
Society. ISBN 0-8218-2012-5. 



Rigged Hilbert space 

In mathematics, a rigged Hilbert space (Gelfand triple, nested Hilbert space, equipped Hilbert space) is a
construction designed to link the distribution and square-integrable aspects of functional analysis. Such spaces were
introduced to study spectral theory in the broad sense. They can bring together the 'bound state' (eigenvector) and
'continuous spectrum' in one place.



Motivation 

Since a function such as 
$$x \mapsto e^{ix},$$

which is in an obvious sense an eigenvector of the differential operator

$$-i\frac{d}{dx}$$

on the real line R, is not square-integrable for the usual Borel measure on R, this requires some way of stepping 
outside the strict confines of the Hilbert space theory. This was supplied by the apparatus of Schwartz distributions, 
and a generalized eigenfunction theory was developed in the years after 1950. 
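
Both halves of this observation can be checked numerically. The sketch below uses a central difference for the operator and a midpoint rule for the integral; the function names, sample point, step size and interval lengths are illustrative choices.

```python
import cmath

def f(x):
    """The candidate eigenfunction x -> exp(i x)."""
    return cmath.exp(1j * x)

def momentum_op(func, x, h=1e-5):
    """Central-difference approximation to (-i d/dx) func at the point x."""
    return -1j * (func(x + h) - func(x - h)) / (2 * h)

def norm_squared(func, L, n=10_000):
    """Midpoint-rule estimate of the integral of |func|^2 over [-L, L]."""
    dx = 2 * L / n
    return sum(abs(func(-L + (i + 0.5) * dx)) ** 2 for i in range(n)) * dx

eig_error = abs(momentum_op(f, 0.7) - f(0.7))   # eigenvalue 1: (-i d/dx) f = f
norm_10 = norm_squared(f, 10.0)
norm_100 = norm_squared(f, 100.0)

print(eig_error, norm_10, norm_100)
```

The function behaves as an eigenfunction with eigenvalue 1, yet its squared norm over [−L, L] equals 2L and grows without bound, so it lies outside the Hilbert space of square-integrable functions.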



Functional analysis approach 

The concept of rigged Hilbert space places this idea in an abstract functional-analytic framework. Formally, a rigged
Hilbert space consists of a Hilbert space H, together with a subspace Φ which carries a finer topology, that is one for
which the natural inclusion

$$\Phi \subseteq H$$

is continuous. It is no loss to assume that Φ is dense in H for the Hilbert norm. We consider the inclusion of dual
spaces H* in Φ*. The latter, dual to Φ in its 'test function' topology, is realised as a space of distributions or
generalised functions of some sort, and the linear functionals on the subspace Φ of type

$$\phi \mapsto \langle v, \phi\rangle$$

for v in H are faithfully represented as distributions (because we assume Φ dense).






Now by applying the Riesz representation theorem we can identify H* with H. Therefore the definition of rigged
Hilbert space is in terms of a sandwich:

$$\Phi \subseteq H \subseteq \Phi^*.$$

The most significant examples are those for which Φ is a nuclear space; this comment is an abstract expression of the idea
that Φ consists of test functions and Φ* of the corresponding distributions.

Formal definition (Gelfand triple) 

A rigged Hilbert space is a pair (H, Φ) with H a Hilbert space, Φ a dense subspace, such that Φ is given a
topological vector space structure for which the inclusion map i is continuous. Identifying H with its dual space H*,
the adjoint to i is the map $i^* : H = H^* \to \Phi^*$. The duality pairing between Φ and Φ* has to be compatible with
the inner product on H: $\langle u, v\rangle_{\Phi\times\Phi^*} = (u, v)_H$ whenever $u \in \Phi \subset H$ and $v \in H = H^* \subset \Phi^*$.
The specific triple (Φ, H, Φ*) is often named the "Gelfand triple" (after the mathematician Israel Gelfand).

Note that even though Φ is isomorphic to Φ* if Φ is a Hilbert space in its own right, this isomorphism is not the same
as the composition of the inclusion i with its adjoint i*.

References 

• J.-P. Antoine, Quantum Mechanics Beyond Hilbert Space (1996), appearing in Irreversibility and Causality, 
Semigroups and Rigged Hilbert Spaces, Arno Bohm, Heinz-Dietrich Doebner, Piotr Kielanowski, eds., 
Springer- Verlag, ISBN 3-540-64305-2. (Provides a survey overview.) 

• Jean Dieudonne, Elements d 'analyse VII (1978). (See paragraphs 23.8 and 23.32) 

• I. M. Gelfand and N. J. Vilenkin. Generalized Functions, vol. 4: Some Applications of Harmonic Analysis. 
Rigged Hilbert Spaces. Academic Press, New York, 1964. 

• R. de la Madrid, "The role of the rigged Hilbert space in Quantum Mechanics," Eur. J. Phys. 26, 287 (2005); 
quant-ph/0502053 

• K. Maurin, Generalized Eigenfunction Expansions and Unitary Representations of Topological Groups, Polish 
Scientific Publishers, Warsaw, 1968. 

• Minlos, R.A. (2001), "Rigged Hilbert space" , in Hazewinkel, Michiel, Encyclopaedia of Mathematics, 
Springer, ISBN 978-1556080104 

References 

[1] http://arxiv.org/abs/quant-ph/0502053
[2] http://eom.springer.de/r/r082340.htm






Quantum inverse scattering method 



The quantum inverse scattering method relates two different approaches: 1) The inverse scattering transform is a method
of solving classical integrable differential equations of evolutionary type; an important concept here is the Lax
representation. 2) The Bethe ansatz is a method of solving quantum models in one space and one time dimension. The
quantum inverse scattering method starts by quantizing the Lax representation and reproduces the results of the Bethe
ansatz. In fact, it allows the Bethe ansatz to be rewritten in a new form: the algebraic Bethe ansatz. This led to further
progress in understanding the Heisenberg model (quantum), the quantum nonlinear Schrödinger equation and the Hubbard model.

In mathematics, the quantum inverse scattering method is a method for solving integrable models in 1+1 
dimensions introduced by L. D. Faddeev in about 1979. 

References 

• Faddeev, L. (1995), "Instructive history of the quantum inverse scattering method", Acta Applicandae
Mathematicae 39 (1): 69-84, MR1329554, ISSN 0167-8019

• Korepin, V. E.; Bogoliubov, N. M.; Izergin, A. G. (1993), Quantum inverse scattering method and correlation
functions [2], Cambridge Monographs on Mathematical Physics, Cambridge University Press, MR1245942,
ISBN 978-0-521-37320-3

References 

[1] http://dx.doi.org/10.1007/BF00994626

[2] http://www.cambridge.org/catalogue/catalogue.asp?isbn=9780521586467



Quasi-Hopf algebra

A quasi-Hopf algebra is a generalization of a Hopf algebra, which was defined by the Russian mathematician
Vladimir Drinfeld in 1989.



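A quasi-Hopf algebra is a quasi-bialgebra $\mathcal{B}_{\mathcal{A}} = (\mathcal{A}, \Delta, \varepsilon, \Phi)$ for which there exist $\alpha, \beta \in \mathcal{A}$ and a bijective
antihomomorphism $S$ (antipode) of $\mathcal{A}$ such that

$\sum_i S(b_i)\, \alpha\, c_i = \varepsilon(a)\, \alpha$

$\sum_i b_i\, \beta\, S(c_i) = \varepsilon(a)\, \beta$

for all $a \in \mathcal{A}$ and where

$\Delta(a) = \sum_i b_i \otimes c_i$

and

$\sum_i X_i\, \beta\, S(Y_i)\, \alpha\, Z_i = \mathbb{I},$

$\sum_j S(P_j)\, \alpha\, Q_j\, \beta\, S(R_j) = \mathbb{I},$

where the expansions for the quantities $\Phi$ and $\Phi^{-1}$ are given by

$\Phi = \sum_i X_i \otimes Y_i \otimes Z_i$

and

$\Phi^{-1} = \sum_j P_j \otimes Q_j \otimes R_j.$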

As for a quasi-bialgebra, the property of being quasi-Hopf is preserved under twisting. 

Usage 

Quasi-Hopf algebras form the basis of the study of Drinfeld twists and the representations in terms of F-matrices
associated with finite-dimensional irreducible representations of quantum affine algebras. F-matrices can be used to
factorize the corresponding R-matrix. This leads to applications in statistical mechanics, as quantum affine algebras
and their representations give rise to solutions of the Yang-Baxter equation, a solvability condition for various
statistical models, allowing characteristics of the model to be deduced from its corresponding quantum affine
algebra. The study of F-matrices has been applied to models such as the Heisenberg XXZ model in the framework of
the algebraic Bethe ansatz. It provides a framework for solving two-dimensional integrable models by using the
quantum inverse scattering method.

References 

• Vladimir Drinfeld, Quasi-Hopf algebras, Leningrad Math J. 1 (1989), 1419-1457 

• J.M. Maillet and J. Sanchez de Santos, Drinfeld Twists and Algebraic Bethe Ansatz, Amer. Math. Soc. Transl. (2) 
Vol. 201, 2000 

Quasitriangular Hopf algebra 

In mathematics, a Hopf algebra, $H$, is quasitriangular [1] if there exists an invertible element, $R$, of $H \otimes H$ such
that

• $R\, \Delta(x) = (T \circ \Delta)(x)\, R$ for all $x \in H$, where $\Delta$ is the coproduct on $H$, and the linear map
$T : H \otimes H \to H \otimes H$ is given by $T(x \otimes y) = y \otimes x$,

• $(\Delta \otimes 1)(R) = R_{13} R_{23}$,

• $(1 \otimes \Delta)(R) = R_{13} R_{12}$,

where $R_{12} = \phi_{12}(R)$, $R_{13} = \phi_{13}(R)$, and $R_{23} = \phi_{23}(R)$, where $\phi_{12} : H \otimes H \to H \otimes H \otimes H$,
$\phi_{13} : H \otimes H \to H \otimes H \otimes H$, and $\phi_{23} : H \otimes H \to H \otimes H \otimes H$ are algebra morphisms determined
by

$\phi_{12}(a \otimes b) = a \otimes b \otimes 1,$
$\phi_{13}(a \otimes b) = a \otimes 1 \otimes b,$
$\phi_{23}(a \otimes b) = 1 \otimes a \otimes b.$

$R$ is called the R-matrix.

As a consequence of the properties of quasitriangularity, the R-matrix, $R$, is a solution of the Yang-Baxter equation
(and so a module $V$ of $H$ can be used to determine quasi-invariants of braids, knots and links). Also as a consequence
of the properties of quasitriangularity, $(\varepsilon \otimes 1) R = (1 \otimes \varepsilon) R = 1 \in H$; moreover $R^{-1} = (S \otimes 1)(R)$,
$R = (1 \otimes S)(R^{-1})$, and $(S \otimes S)(R) = R$. One may further show that the antipode $S$ must be a linear
isomorphism, and thus $S^2$ is an automorphism. In fact, $S^2$ is given by conjugating by an invertible element:
$S^2(x) = u x u^{-1}$ where $u = m(S \otimes 1) R^{21}$ (cf. Ribbon Hopf algebras).
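
As a concrete illustration of the Yang-Baxter equation $R_{12} R_{13} R_{23} = R_{23} R_{13} R_{12}$, the following sketch checks it numerically for one 4 x 4 matrix. The matrix used is an assumption brought in for illustration (the commonly quoted R-matrix of the quantum group $U_q(sl_2)$ in its two-dimensional representation, up to normalization); the value q = 2 and the helper names `kron` and `matmul` are likewise arbitrary choices, not taken from the text.

```python
from fractions import Fraction


def kron(a, b):
    """Kronecker product of two square matrices given as lists of lists."""
    n, m = len(a), len(b)
    return [[a[i // m][j // m] * b[i % m][j % m] for j in range(n * m)]
            for i in range(n * m)]


def matmul(a, b):
    """Plain matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


q = Fraction(2)  # arbitrary illustrative value of the deformation parameter
I2 = [[1, 0], [0, 1]]

# Assumed example R-matrix acting on C^2 (x) C^2, basis order 11, 12, 21, 22.
R = [[q, 0, 0, 0],
     [0, 1, q - 1 / q, 0],
     [0, 0, 1, 0],
     [0, 0, 0, q]]

# Flip map T(x (x) y) = y (x) x on C^2 (x) C^2.
P = [[1, 0, 0, 0],
     [0, 0, 1, 0],
     [0, 1, 0, 0],
     [0, 0, 0, 1]]

R12 = kron(R, I2)                       # R acting on factors 1 and 2
R23 = kron(I2, R)                       # R acting on factors 2 and 3
P23 = kron(I2, P)                       # swap of factors 2 and 3
R13 = matmul(P23, matmul(R12, P23))     # R acting on factors 1 and 3

lhs = matmul(R12, matmul(R13, R23))
rhs = matmul(R23, matmul(R13, R12))
print(lhs == rhs)
```

Exact rational arithmetic is used so that the comparison of the two sides does not depend on floating-point tolerance.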

It is possible to construct a quasitriangular Hopf algebra from a Hopf algebra and its dual, using the Drinfel'd 
quantum double construction. 






Twisting 

The property of being a quasi-triangular Hopf algebra is preserved by twisting via an invertible element
$F = \sum_i f^i \otimes f_i \in \mathcal{A} \otimes \mathcal{A}$ such that $(\varepsilon \otimes \mathrm{id}) F = (\mathrm{id} \otimes \varepsilon) F = 1$ and satisfying the cocycle condition

$(F \otimes 1) \circ (\Delta \otimes \mathrm{id}) F = (1 \otimes F) \circ (\mathrm{id} \otimes \Delta) F.$

Furthermore, $u = \sum_i f^i S(f_i)$ is invertible and the twisted antipode is given by $S'(a) = u S(a) u^{-1}$, with the
twisted comultiplication, R-matrix and co-unit change according to those defined for the quasi-triangular quasi-Hopf
algebra. Such a twist is known as an admissible (or Drinfel'd) twist.

Notes 

[1] Montgomery & Schneider (2002), p. 72 (http://books.google.com/books?id=I3IK9U5Co_0C&pg=PA72&dq=''Quasitriangular''). 

References 

• Susan Montgomery, Hans-Jurgen Schneider. New directions in Hopf algebras , Volume 43. Cambridge University 
Press, 2002. ISBN 9780521815123 

Ribbon Hopf algebra 

A ribbon Hopf algebra $(A, m, \Delta, u, \varepsilon, S, \mathcal{R}, \nu)$ is a quasitriangular Hopf algebra which possesses an invertible
central element $\nu$, more commonly known as the ribbon element, such that the following conditions hold:

$\nu^2 = u S(u), \quad S(\nu) = \nu, \quad \varepsilon(\nu) = 1$

$\Delta(\nu) = (\mathcal{R}_{21} \mathcal{R}_{12})^{-1} (\nu \otimes \nu)$

where $u = m(S \otimes \mathrm{id})(\mathcal{R}_{21})$. Note that the element $u$ exists for any quasitriangular Hopf algebra, and $u S(u)$
must always be central and satisfies

$S(u S(u)) = u S(u), \quad \varepsilon(u S(u)) = 1, \quad \Delta(u S(u)) = (\mathcal{R}_{21} \mathcal{R}_{12})^{-2} (u S(u) \otimes u S(u)),$

so that all that is required is that it have a central square root with the above properties.
Here

$A$ is a vector space
$m$ is the multiplication map $m : A \otimes A \to A$
$\Delta$ is the co-product map $\Delta : A \to A \otimes A$
$u$ is the unit operator $u : \mathbb{C} \to A$
$\varepsilon$ is the co-unit operator $\varepsilon : A \to \mathbb{C}$
$S$ is the antipode $S : A \to A$
$\mathcal{R}$ is a universal R matrix
We assume that the underlying field $K$ is $\mathbb{C}$.






See also 

• Quasitriangular Hopf algebra 

• Quasi-triangular Quasi-Hopf algebra 

References 

• Altschuler, D., Coste, A.: Quasi-quantum groups, knots, three-manifolds and topological field theory. Commun. 
Math. Phys. 150 1992 83-107 http://arxiv.org/pdf/hep-th/9202047 

• Chari, V.C., Pressley, A.: A Guide to Quantum Groups Cambridge University Press, 1994 ISBN 0-521-55884-0. 

• Vladimir Drinfeld, Quasi-Hopf algebras, Leningrad Math J. 1 (1989), 1419-1457 

• Shahn Majid : Foundations of Quantum Group Theory Cambridge University Press, 1995 

Quasi-triangular Quasi-Hopf algebra 

A quasi-triangular quasi-Hopf algebra is a specialized form of a quasi-Hopf algebra defined by the Ukrainian 
mathematician Vladimir Drinfeld in 1989. It is also a generalized form of a quasi-triangular Hopf algebra. 

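A quasi-triangular quasi-Hopf algebra is a set $\mathcal{H}_{\mathcal{A}} = (\mathcal{A}, R, \Delta, \varepsilon, \Phi)$ where $\mathcal{B}_{\mathcal{A}} = (\mathcal{A}, \Delta, \varepsilon, \Phi)$ is a
quasi-Hopf algebra and $R \in \mathcal{A} \otimes \mathcal{A}$, known as the R-matrix, is an invertible element such that

$R\, \Delta(a) = \sigma \circ \Delta(a)\, R, \quad a \in \mathcal{A}$

$\sigma : \mathcal{A} \otimes \mathcal{A} \to \mathcal{A} \otimes \mathcal{A}$

$x \otimes y \mapsto y \otimes x$

so that $\sigma$ is the switch map and

$(\Delta \otimes \mathrm{id}) R = \Phi_{321} R_{13} \Phi_{132}^{-1} R_{23} \Phi_{123}$

$(\mathrm{id} \otimes \Delta) R = \Phi_{231}^{-1} R_{13} \Phi_{213} R_{12} \Phi_{123}^{-1}$

where $\Phi_{abc} = x_a \otimes x_b \otimes x_c$ and $\Phi_{123} = \Phi = x_1 \otimes x_2 \otimes x_3 \in \mathcal{A} \otimes \mathcal{A} \otimes \mathcal{A}$.

The quasi-Hopf algebra becomes triangular if in addition, $R_{21} R_{12} = 1$.

The twisting of $\mathcal{H}_{\mathcal{A}}$ by $F \in \mathcal{A} \otimes \mathcal{A}$ is the same as for a quasi-Hopf algebra, with the additional definition of the
twisted R-matrix.

A quasi-triangular (resp. triangular) quasi-Hopf algebra with $\Phi = 1$ is a quasi-triangular (resp. triangular) Hopf
algebra, as the latter two conditions in the definition reduce to the conditions of quasi-triangularity of a Hopf algebra.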

Similarly to the twisting properties of the quasi-Hopf algebra, the property of being quasi-triangular or triangular 
quasi-Hopf algebra is preserved by twisting. 

See also 

• Quasitriangular Hopf algebra 

• Ribbon Hopf algebra 

References 

• Vladimir Drinfeld, Quasi-Hopf algebras, Leningrad Math J. 1 (1989), 1419-1457 

• J.M. Maillet and J. Sanchez de Santos, Drinfeld Twists and Algebraic Bethe Ansatz, Amer. Math. Soc. Transl. (2) 
Vol. 201, 2000 






Grassmann algebra 

In mathematics, the exterior product or wedge product of vectors is an algebraic construction generalizing certain 
features of the cross product to higher dimensions. Like the cross product, and the scalar triple product, the exterior 
product of vectors is used in Euclidean geometry to study areas, volumes, and their higher-dimensional analogs. 
Also, like the cross product, the exterior product is alternating, meaning that $u \wedge u = 0$ for all vectors $u$, or
equivalently $u \wedge v = -v \wedge u$ for all vectors $u$ and $v$. In linear algebra, the exterior product provides an abstract
algebraic manner for describing the determinant and the minors of a linear transformation that is basis-independent, 
and is fundamentally related to ideas of rank and linear independence. 

The exterior algebra (also known as the Grassmann algebra, after Hermann Grassmann) of a given vector
space $V$ over a field $K$ is the unital associative algebra $\Lambda(V)$ generated by the exterior product. It is widely used in
contemporary geometry, especially differential geometry and algebraic geometry through the algebra of differential 
forms, as well as in multilinear algebra and related fields. In terms of category theory, the exterior algebra is a type 
of functor on vector spaces, given by a universal construction. The universal construction allows the exterior algebra 
to be defined, not just for vector spaces over a field, but also for modules over a commutative ring, and for other 
structures of interest. The exterior algebra is one example of a bialgebra, meaning that its dual space also possesses a 
product, and this dual product is compatible with the wedge product. This dual algebra is precisely the algebra of 
alternating multilinear forms on V, and the pairing between the exterior algebra and its dual is given by the interior 
product. 



Motivating examples 
Areas in the plane 

[Figure: The area of a parallelogram in terms of the determinant of the matrix of coordinates of two of its vertices.]

The Cartesian plane $\mathbb{R}^2$ is a vector space equipped with a basis consisting of a pair of unit vectors

$e_1 = (1, 0), \quad e_2 = (0, 1).$

Suppose that

$v = v_1 e_1 + v_2 e_2, \quad w = w_1 e_1 + w_2 e_2$

are a pair of given vectors in $\mathbb{R}^2$, written in components. There is a unique parallelogram having $v$ and $w$ as two of its
sides. The area of this parallelogram is given by the standard determinant formula:






$A = |\det [v\ w]| = |v_1 w_2 - v_2 w_1|.$

Consider now the exterior product of $v$ and $w$:

$v \wedge w = (v_1 e_1 + v_2 e_2) \wedge (w_1 e_1 + w_2 e_2)$

$= v_1 w_1\, e_1 \wedge e_1 + v_1 w_2\, e_1 \wedge e_2 + v_2 w_1\, e_2 \wedge e_1 + v_2 w_2\, e_2 \wedge e_2$

$= (v_1 w_2 - v_2 w_1)\, e_1 \wedge e_2$

where the first step uses the distributive law for the wedge product, and the last uses the fact that the wedge product
is alternating, and in particular $e_2 \wedge e_1 = -e_1 \wedge e_2$. Note that the coefficient in this last expression is precisely the
determinant of the matrix [v w]. The fact that this may be positive or negative has the intuitive meaning that v and w 
may be oriented in a counterclockwise or clockwise sense as the vertices of the parallelogram they define. Such an 
area is called the signed area of the parallelogram: the absolute value of the signed area is the ordinary area, and the 
sign determines its orientation. 

The fact that this coefficient is the signed area is not an accident. In fact, it is relatively easy to see that the exterior 
product should be related to the signed area if one tries to axiomatize this area as an algebraic construct. In detail, if 
A(v,w) denotes the signed area of the parallelogram determined by the pair of vectors v and w, then A must satisfy 
the following properties: 

1. A(av, bw) = ab A(v, w) for any real numbers a and b, since rescaling either of the sides rescales the area by the
same amount (and reversing the direction of one of the sides reverses the orientation of the parallelogram).

2. A(v, v) = 0, since the area of the degenerate parallelogram determined by v (i.e., a line segment) is zero.

3. A(w, v) = -A(v, w), since interchanging the roles of v and w reverses the orientation of the parallelogram.

4. A(v + aw, w) = A(v, w) for any real number a, since adding a multiple of w to v affects neither the base nor the
height of the parallelogram and consequently preserves its area.

5. A(e_1, e_2) = 1, since the area of the unit square is one.

With the exception of the last property, the wedge product satisfies the same formal properties as the area. In a
certain sense, the wedge product generalizes the final property by allowing the area of a parallelogram to be
compared to that of any "standard" chosen parallelogram. In other words, the exterior product in two dimensions is a
basis-independent formulation of area. [3]
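
The five properties of the signed area can be spot-checked numerically. The sketch below is only an illustration: the function name `wedge2` and the sample vectors are hypothetical choices, and the function simply returns the coefficient of $e_1 \wedge e_2$ in $v \wedge w$, which is the determinant computed earlier in this section.

```python
def wedge2(v, w):
    """Coefficient of e1 ^ e2 in v ^ w for v, w in R^2 (the signed area)."""
    return v[0] * w[1] - v[1] * w[0]


v, w = (3, 1), (1, 2)          # arbitrary sample vectors
a, b = 2, -3                   # arbitrary scalars

signed_area = wedge2(v, w)

# Property 1: bilinear rescaling.
scaled = wedge2((a * v[0], a * v[1]), (b * w[0], b * w[1]))
# Property 3: swapping the arguments flips the sign.
swapped = wedge2(w, v)
# Property 4: shearing v along w leaves the area unchanged.
sheared = wedge2((v[0] + 4 * w[0], v[1] + 4 * w[1]), w)

print(signed_area, scaled, swapped, sheared)
```

A positive result means that the pair (v, w) has the same orientation as (e_1, e_2); a negative one means the opposite orientation.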

Cross and triple products 

For vectors in $\mathbb{R}^3$, the exterior algebra is closely related to the cross product and triple product. Using the standard
basis $\{e_1, e_2, e_3\}$, the wedge product of a pair of vectors

$u = u_1 e_1 + u_2 e_2 + u_3 e_3$

and

$v = v_1 e_1 + v_2 e_2 + v_3 e_3$

is

$u \wedge v = (u_1 v_2 - u_2 v_1)(e_1 \wedge e_2) + (u_3 v_1 - u_1 v_3)(e_3 \wedge e_1) + (u_2 v_3 - u_3 v_2)(e_2 \wedge e_3)$

where $\{e_1 \wedge e_2, e_3 \wedge e_1, e_2 \wedge e_3\}$ is the basis for the three-dimensional space $\Lambda^2(\mathbb{R}^3)$. This imitates the usual
definition of the cross product of vectors in three dimensions.

Bringing in a third vector

$w = w_1 e_1 + w_2 e_2 + w_3 e_3,$

the wedge product of three vectors is

$u \wedge v \wedge w = (u_1 v_2 w_3 + u_2 v_3 w_1 + u_3 v_1 w_2 - u_1 v_3 w_2 - u_2 v_1 w_3 - u_3 v_2 w_1)(e_1 \wedge e_2 \wedge e_3)$

where $e_1 \wedge e_2 \wedge e_3$ is the basis vector for the one-dimensional space $\Lambda^3(\mathbb{R}^3)$. This imitates the usual definition of the
triple product.
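
The correspondence with the cross product and the triple product can be made concrete with a short sketch. The function names and the sample vectors below are hypothetical; the formulas are transcribed directly from the component expressions above, with the coefficients of $u \wedge v$ listed on the basis $(e_1 \wedge e_2, e_3 \wedge e_1, e_2 \wedge e_3)$.

```python
def wedge_uv(u, v):
    """Coefficients of u ^ v on the basis (e1^e2, e3^e1, e2^e3)."""
    return (u[0] * v[1] - u[1] * v[0],
            u[2] * v[0] - u[0] * v[2],
            u[1] * v[2] - u[2] * v[1])


def cross(u, v):
    """Ordinary cross product, components along (e1, e2, e3)."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def wedge_uvw(u, v, w):
    """Coefficient of e1 ^ e2 ^ e3 in u ^ v ^ w."""
    return (u[0] * v[1] * w[2] + u[1] * v[2] * w[0] + u[2] * v[0] * w[1]
            - u[0] * v[2] * w[1] - u[1] * v[0] * w[2] - u[2] * v[1] * w[0])


u, v, w = (1, 2, 3), (4, 5, 6), (7, 8, 10)   # arbitrary sample vectors

bivector = wedge_uv(u, v)
triple = sum(c * x for c, x in zip(cross(u, v), w))   # (u x v) . w

print(bivector, cross(u, v), wedge_uvw(u, v, w), triple)
```

The bivector coefficients are the cross product components read in reverse order, which reflects the pairing $e_1 \wedge e_2 \leftrightarrow e_3$, $e_3 \wedge e_1 \leftrightarrow e_2$, $e_2 \wedge e_3 \leftrightarrow e_1$.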






The cross product and triple product in three dimensions each admit both geometric and algebraic interpretations. 
The cross product $u \times v$ can be interpreted as a vector which is perpendicular to both $u$ and $v$ and whose magnitude is
equal to the area of the parallelogram determined by the two vectors. It can also be interpreted as the vector 
consisting of the minors of the matrix with columns u and v. The triple product of u, v, and w is geometrically a 
(signed) volume. Algebraically, it is the determinant of the matrix with columns u, v, and w. The exterior product in 
three-dimensions allows for similar interpretations. In fact, in the presence of a positively oriented orthonormal 
basis, the exterior product generalizes these notions to higher dimensions. 

Formal definitions and algebraic properties 

The exterior algebra $\Lambda(V)$ over a vector space $V$ is defined as the quotient algebra of the tensor algebra by the
two-sided ideal $I$ generated by all elements of the form $x \otimes x$ such that $x \in V$. Symbolically,

$\Lambda(V) := T(V)/I.$

The wedge product $\wedge$ of two elements of $\Lambda(V)$ is defined by

$\alpha \wedge \beta = \alpha \otimes \beta \pmod{I}.$

Anticommutativity of the wedge product

The wedge product is alternating on elements of $V$, which means that $x \wedge x = 0$ for all $x \in V$. It follows that the
product is also anticommutative on elements of $V$, for supposing that $x, y \in V$,

$0 = (x + y) \wedge (x + y) = x \wedge x + x \wedge y + y \wedge x + y \wedge y = x \wedge y + y \wedge x$

whence

$x \wedge y = -y \wedge x.$

More generally, if $x_1, x_2, \ldots, x_k$ are elements of $V$, and $\sigma$ is a permutation of the integers $[1, \ldots, k]$, then

$x_{\sigma(1)} \wedge x_{\sigma(2)} \wedge \cdots \wedge x_{\sigma(k)} = \operatorname{sgn}(\sigma)\, x_1 \wedge x_2 \wedge \cdots \wedge x_k,$

where $\operatorname{sgn}(\sigma)$ is the signature of the permutation $\sigma$.

The exterior power

The kth exterior power of $V$, denoted $\Lambda^k(V)$, is the vector subspace of $\Lambda(V)$ spanned by elements of the form

$x_1 \wedge x_2 \wedge \cdots \wedge x_k, \quad x_i \in V,\ i = 1, 2, \ldots, k.$

If $\alpha \in \Lambda^k(V)$, then $\alpha$ is said to be a k-multivector. If, furthermore, $\alpha$ can be expressed as a wedge product of k
elements of $V$, then $\alpha$ is said to be decomposable. Although decomposable multivectors span $\Lambda^k(V)$, not every
element of $\Lambda^k(V)$ is decomposable. For example, in $\mathbb{R}^4$, the following 2-multivector is not decomposable:

$\alpha = e_1 \wedge e_2 + e_3 \wedge e_4.$

(This is in fact a symplectic form, since $\alpha \wedge \alpha \neq 0$. [6])
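
The claim that $\alpha \wedge \alpha \neq 0$ can be verified mechanically. The following is a minimal sketch, assuming a hypothetical representation in which a multivector is a dictionary from increasing index tuples to coefficients; the helper names `sort_with_sign` and `wedge` are illustrative only. A decomposable 2-multivector is included for contrast, since any decomposable element squares to zero.

```python
def sort_with_sign(indices):
    """Sort basis indices; return (sign, sorted tuple), or (0, ()) on a repeat."""
    idx = list(indices)
    if len(set(idx)) < len(idx):
        return 0, ()            # e_i ^ e_i = 0
    sign = 1
    for i in range(len(idx)):   # bubble sort: each adjacent swap flips the sign
        for j in range(len(idx) - 1 - i):
            if idx[j] > idx[j + 1]:
                idx[j], idx[j + 1] = idx[j + 1], idx[j]
                sign = -sign
    return sign, tuple(idx)


def wedge(a, b):
    """Wedge product of multivectors stored as {index tuple: coefficient}."""
    out = {}
    for ia, ca in a.items():
        for ib, cb in b.items():
            sign, idx = sort_with_sign(ia + ib)
            if sign:
                out[idx] = out.get(idx, 0) + sign * ca * cb
    return {k: c for k, c in out.items() if c != 0}


# alpha = e1 ^ e2 + e3 ^ e4 in R^4
alpha = {(1, 2): 1, (3, 4): 1}
alpha_squared = wedge(alpha, alpha)

# A decomposable 2-multivector: (e1 + e3) ^ (e2 + e4)
x = {(1,): 1, (3,): 1}
y = {(2,): 1, (4,): 1}
beta = wedge(x, y)
beta_squared = wedge(beta, beta)

print(alpha_squared, beta_squared)
```

Because $\beta \wedge \beta = 0$ for every decomposable $\beta$ while $\alpha \wedge \alpha$ is a nonzero multiple of $e_1 \wedge e_2 \wedge e_3 \wedge e_4$, the element $\alpha$ cannot be decomposable.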






Basis and dimension 

If the dimension of $V$ is $n$ and $\{e_1, \ldots, e_n\}$ is a basis of $V$, then the set

$\{e_{i_1} \wedge e_{i_2} \wedge \cdots \wedge e_{i_k} \mid 1 \le i_1 < i_2 < \cdots < i_k \le n\}$

is a basis for $\Lambda^k(V)$. The reason is the following: given any wedge product of the form

$v_1 \wedge \cdots \wedge v_k$

then every vector $v_j$ can be written as a linear combination of the basis vectors $e_i$; using the bilinearity of the wedge
product, this can be expanded to a linear combination of wedge products of those basis vectors. Any wedge product
in which the same basis vector appears more than once is zero; any wedge product in which the basis vectors do not
appear in the proper order can be reordered, changing the sign whenever two basis vectors change places. In general,
the resulting coefficients of the basis vectors can be computed as the minors of the matrix that describes the vectors
$v_j$ in terms of the basis $e_i$.

By counting the basis elements, the dimension of $\Lambda^k(V)$ is the binomial coefficient $C(n, k)$. In particular, $\Lambda^k(V) = \{0\}$
for $k > n$.

Any element of the exterior algebra can be written as a sum of multivectors. Hence, as a vector space the exterior
algebra is a direct sum

$\Lambda(V) = \Lambda^0(V) \oplus \Lambda^1(V) \oplus \Lambda^2(V) \oplus \cdots \oplus \Lambda^n(V)$

(where by convention $\Lambda^0(V) = K$ and $\Lambda^1(V) = V$), and therefore its dimension is equal to the sum of the binomial
coefficients, which is $2^n$.
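
The dimension count can be reproduced by listing the increasing index tuples directly. This is a small sketch; the function name `basis` and the choice n = 4 are illustrative assumptions, and each tuple $(i_1, \ldots, i_k)$ stands for the basis element $e_{i_1} \wedge \cdots \wedge e_{i_k}$.

```python
from itertools import combinations
from math import comb


def basis(n, k):
    """Index tuples i_1 < ... < i_k labelling a basis of the k-th exterior power."""
    return list(combinations(range(1, n + 1), k))


n = 4
dims = [len(basis(n, k)) for k in range(n + 1)]   # dimensions of each graded piece
total = sum(dims)                                 # dimension of the whole algebra

print(basis(n, 2))
print(dims, total)
```

For n = 4 the graded pieces have dimensions 1, 4, 6, 4, 1, which sum to 16.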

Rank of a multivector 

If $\alpha \in \Lambda^k(V)$, then it is possible to express $\alpha$ as a linear combination of decomposable multivectors:

$\alpha = \alpha^{(1)} + \alpha^{(2)} + \cdots + \alpha^{(s)}$

where each $\alpha^{(i)}$ is decomposable, say

$\alpha^{(i)} = \alpha^{(i)}_1 \wedge \cdots \wedge \alpha^{(i)}_k, \quad i = 1, 2, \ldots, s.$

The rank of the multivector $\alpha$ is the minimal number of decomposable multivectors in such an expansion of $\alpha$. This
is similar to the notion of tensor rank.

Rank is particularly important in the study of 2-multivectors (Sternberg 1974, §III.6) (Bryant et al. 1991). The rank
of a 2-multivector $\alpha$ can be identified with half the rank of the matrix of coefficients of $\alpha$ in a basis. Thus if $e_i$ is a
basis for $V$, then $\alpha$ can be expressed uniquely as

$\alpha = \sum_{i,j} a_{ij}\, e_i \wedge e_j$

where $a_{ij} = -a_{ji}$ (the matrix of coefficients is skew-symmetric). The rank of the matrix $a_{ij}$ is therefore even, and is
twice the rank of the form $\alpha$.

In characteristic 0, the 2-multivector $\alpha$ has rank $p$ if and only if

$\underbrace{\alpha \wedge \cdots \wedge \alpha}_{p} \neq 0$

and

$\underbrace{\alpha \wedge \cdots \wedge \alpha}_{p+1} = 0.$






Graded structure 

The wedge product of a k-multivector with a p-multivector is a (k+p)-multivector, once again invoking bilinearity.
As a consequence, the direct sum decomposition of the preceding section

$\Lambda(V) = \Lambda^0(V) \oplus \Lambda^1(V) \oplus \Lambda^2(V) \oplus \cdots \oplus \Lambda^n(V)$

gives the exterior algebra the additional structure of a graded algebra. Symbolically,

$(\Lambda^k(V)) \wedge (\Lambda^p(V)) \subset \Lambda^{k+p}(V).$

Moreover, the wedge product is graded anticommutative, meaning that if $\alpha \in \Lambda^k(V)$ and $\beta \in \Lambda^p(V)$, then

$\alpha \wedge \beta = (-1)^{kp}\, \beta \wedge \alpha.$

In addition to studying the graded structure on the exterior algebra, Bourbaki (1989) studies additional graded 
structures on exterior algebras, such as those on the exterior algebra of a graded module (a module that already 
carries its own gradation). 

Universal property 

Let $V$ be a vector space over the field $K$. Informally, multiplication in $\Lambda(V)$ is performed by manipulating symbols
and imposing a distributive law, an associative law, and using the identity $v \wedge v = 0$ for $v \in V$. Formally, $\Lambda(V)$ is the
"most general" algebra in which these rules hold for the multiplication, in the sense that any unital associative
$K$-algebra containing $V$ with alternating multiplication on $V$ must contain a homomorphic image of $\Lambda(V)$. In other
words, the exterior algebra has the following universal property:

Given any unital associative $K$-algebra $A$ and any $K$-linear map $j : V \to A$ such that $j(v) j(v) = 0$ for every $v$ in $V$, then
there exists precisely one unital algebra homomorphism $f : \Lambda(V) \to A$ such that $j(v) = f(i(v))$ for all $v$ in $V$, where
$i : V \to \Lambda(V)$ is the natural inclusion.

[Diagram: universal property of the exterior algebra, with $i : V \to \Lambda(V)$, $j : V \to A$ and $f : \Lambda(V) \to A$.]

To construct the most general algebra that contains $V$ and whose multiplication is alternating on $V$, it is natural to
start with the most general algebra that contains $V$, the tensor algebra $T(V)$, and then enforce the alternating property
by taking a suitable quotient. We thus take the two-sided ideal $I$ in $T(V)$ generated by all elements of the form $v \otimes v$
for $v$ in $V$, and define $\Lambda(V)$ as the quotient

$\Lambda(V) = T(V)/I$

(and use $\wedge$ as the symbol for multiplication in $\Lambda(V)$). It is then straightforward to show that $\Lambda(V)$ contains $V$ and
satisfies the above universal property.

As a consequence of this construction, the operation of assigning to a vector space $V$ its exterior algebra $\Lambda(V)$ is a
functor from the category of vector spaces to the category of algebras.

Rather than defining $\Lambda(V)$ first and then identifying the exterior powers $\Lambda^k(V)$ as certain subspaces, one may
alternatively define the spaces $\Lambda^k(V)$ first and then combine them to form the algebra $\Lambda(V)$. This approach is often
used in differential geometry and is described in the next section.







Generalizations 

Given a commutative ring $R$ and an $R$-module $M$, we can define the exterior algebra $\Lambda(M)$ just as above, as a suitable
quotient of the tensor algebra $T(M)$. It will satisfy the analogous universal property. Many of the properties of $\Lambda(M)$
also require that $M$ be a projective module. Where finite-dimensionality is used, the properties further require that $M$
be finitely generated and projective. Generalizations to the most common situations can be found in (Bourbaki
1989).

Exterior algebras of vector bundles are frequently considered in geometry and topology. There are no essential 
differences between the algebraic properties of the exterior algebra of finite-dimensional vector bundles and those of 
the exterior algebra of finitely-generated projective modules, by the Serre-Swan theorem. More general exterior 
algebras can be defined for sheaves of modules. 

Duality 

Alternating operators 

Given two vector spaces $V$ and $X$, an alternating operator (or anti-symmetric operator) from $V^k$ to $X$ is a multilinear
map

$f : V^k \to X$

such that whenever $v_1, \ldots, v_k$ are linearly dependent vectors in $V$, then

$f(v_1, \ldots, v_k) = 0.$

A well-known example is the determinant, an alternating operator from $(K^n)^n$ to $K$.
The map

$w : V^k \to \Lambda^k(V)$

which associates to k vectors from $V$ their wedge product, i.e. their corresponding k-vector, is also alternating. In
fact, this map is the "most general" alternating operator defined on $V^k$: given any other alternating operator $f : V^k \to X$,
there exists a unique linear map $\varphi : \Lambda^k(V) \to X$ with $f = \varphi \circ w$. This universal property characterizes the space
$\Lambda^k(V)$ and can serve as its definition.
Alternating multilinear forms 

The above discussion specializes to the case when $X = K$, the base field. In this case an alternating multilinear
function

$f : V^k \to K$

is called an alternating multilinear form. The set of all alternating multilinear forms is a vector space, as the sum
of two such maps, or the product of such a map with a scalar, is again alternating. By the universal property of the
exterior power, the space of alternating forms of degree k on $V$ is naturally isomorphic with the dual vector space
$(\Lambda^k V)^*$. If $V$ is finite-dimensional, then the latter is naturally isomorphic to $\Lambda^k(V^*)$. In particular, the dimension of
the space of anti-symmetric maps from $V^k$ to $K$ is the binomial coefficient n choose k.

Under this identification, the wedge product takes a concrete form: it produces a new anti-symmetric map from two
given ones. Suppose $\omega : V^k \to K$ and $\eta : V^m \to K$ are two anti-symmetric maps. As in the case of tensor products of
multilinear maps, the number of variables of their wedge product is the sum of the numbers of their variables. It is
defined as follows:

$\omega \wedge \eta = \frac{(k + m)!}{k!\, m!} \operatorname{Alt}(\omega \otimes \eta)$

where the alternation Alt of a multilinear map is defined to be the signed average of the values over all the
permutations of its variables:






$\operatorname{Alt}(\omega)(x_1, \ldots, x_k) = \frac{1}{k!} \sum_{\sigma \in S_k} \operatorname{sgn}(\sigma)\, \omega(x_{\sigma(1)}, \ldots, x_{\sigma(k)}).$

This definition of the wedge product is well-defined even if the field $K$ has finite characteristic, if one considers an
equivalent version of the above that does not use factorials or any constants:

$\omega \wedge \eta(x_1, \ldots, x_{k+m}) = \sum_{\sigma \in Sh_{k,m}} \operatorname{sgn}(\sigma)\, \omega(x_{\sigma(1)}, \ldots, x_{\sigma(k)})\, \eta(x_{\sigma(k+1)}, \ldots, x_{\sigma(k+m)}),$

where here $Sh_{k,m} \subset S_{k+m}$ is the subset of (k,m) shuffles: permutations $\sigma$ of the set $\{1, 2, \ldots, k+m\}$ such that
$\sigma(1) < \sigma(2) < \cdots < \sigma(k)$, and $\sigma(k+1) < \sigma(k+2) < \cdots < \sigma(k+m)$. [8]
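
The shuffle formula translates almost literally into code. The sketch below assumes a hypothetical representation of a k-form as a Python function of k vector arguments; the names `shuffles`, `perm_sign` and `wedge_forms` are illustrative, and positions are numbered from 0 rather than 1. As a check, the wedge of the two coordinate 1-forms on the plane reproduces the signed area.

```python
from itertools import combinations


def perm_sign(perm):
    """Sign of a permutation given as a sequence of distinct integers."""
    sign = 1
    for i in range(len(perm)):
        for j in range(i + 1, len(perm)):
            if perm[i] > perm[j]:
                sign = -sign
    return sign


def shuffles(k, m):
    """Yield every (k, m)-shuffle as a tuple of 0-based positions."""
    for first in combinations(range(k + m), k):
        rest = tuple(i for i in range(k + m) if i not in first)
        yield first + rest      # both halves are increasing by construction


def wedge_forms(omega, k, eta, m):
    """Wedge of a k-form and an m-form, using only the shuffle sum."""
    def result(*xs):
        total = 0
        for s in shuffles(k, m):
            total += (perm_sign(s)
                      * omega(*(xs[i] for i in s[:k]))
                      * eta(*(xs[i] for i in s[k:])))
        return total
    return result


def dx(v):
    """Coordinate 1-form reading off the first component."""
    return v[0]


def dy(v):
    """Coordinate 1-form reading off the second component."""
    return v[1]


area = wedge_forms(dx, 1, dy, 1)
print(area((3, 1), (1, 2)), len(list(shuffles(2, 2))))
```

No division appears anywhere in the computation, which is why this form of the definition remains usable over fields of finite characteristic.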
Bialgebra structure 

In formal terms, there is a correspondence between the graded dual of the graded algebra $\Lambda(V)$ and alternating
multilinear forms on $V$. The wedge product of multilinear forms defined above is dual to a coproduct defined on
$\Lambda(V)$, giving the structure of a coalgebra.

The coproduct is a linear function $\Delta : \Lambda(V) \to \Lambda(V) \otimes \Lambda(V)$ given on decomposable elements by

$\Delta(x_1 \wedge \cdots \wedge x_k) = \sum_{p=0}^{k} \sum_{\sigma \in Sh_{p,k-p}} \operatorname{sgn}(\sigma)\, (x_{\sigma(1)} \wedge \cdots \wedge x_{\sigma(p)}) \otimes (x_{\sigma(p+1)} \wedge \cdots \wedge x_{\sigma(k)}).$

For example,

$\Delta(x_1) = 1 \otimes x_1 + x_1 \otimes 1,$

$\Delta(x_1 \wedge x_2) = 1 \otimes (x_1 \wedge x_2) + x_1 \otimes x_2 - x_2 \otimes x_1 + (x_1 \wedge x_2) \otimes 1.$

This extends by linearity to an operation defined on the whole exterior algebra. In terms of the coproduct, the wedge
product on the dual space is just the graded dual of the coproduct:

$(\alpha \wedge \beta)(x_1 \wedge \cdots \wedge x_k) = (\alpha \otimes \beta)(\Delta(x_1 \wedge \cdots \wedge x_k))$

where the tensor product on the right-hand side is of multilinear maps (extended by zero on elements of
incompatible homogeneous degree: more precisely, $\alpha \wedge \beta = \varepsilon \circ (\alpha \otimes \beta) \circ \Delta$, where $\varepsilon$ is the counit, as defined
presently).

The counit is the homomorphism $\varepsilon : \Lambda(V) \to K$ which returns the 0-graded component of its argument. The
coproduct and counit, along with the wedge product, define the structure of a bialgebra on the exterior algebra.

With an antipode defined on homogeneous elements by $S(x) = (-1)^{\deg x} x$, the exterior algebra is furthermore a Hopf
algebra. [9]

Interior product 

Suppose that $V$ is finite-dimensional. If $V^*$ denotes the dual space to the vector space $V$, then for each $\alpha \in V^*$, it is
possible to define an antiderivation on the algebra $\Lambda(V)$,

$i_\alpha : \Lambda^k V \to \Lambda^{k-1} V.$

This derivation is called the interior product with $\alpha$, or sometimes the insertion operator, or contraction by $\alpha$.

Suppose that $w \in \Lambda^k V$. Then $w$ is a multilinear mapping of $V^*$ to $K$, so it is defined by its values on the k-fold
Cartesian product $V^* \times V^* \times \cdots \times V^*$. If $u_1, u_2, \ldots, u_{k-1}$ are $k-1$ elements of $V^*$, then define

$(i_\alpha w)(u_1, u_2, \ldots, u_{k-1}) = w(\alpha, u_1, u_2, \ldots, u_{k-1}).$

Additionally, let $i_\alpha f = 0$ whenever $f$ is a pure scalar (i.e., belonging to $\Lambda^0 V$).






Axiomatic characterization and properties 

The interior product satisfies the following properties: 

1. For each k and each $\alpha \in V^*$,

$i_\alpha : \Lambda^k V \to \Lambda^{k-1} V.$

(By convention, $\Lambda^{-1} = 0$.)

2. If $v$ is an element of $V$ ($= \Lambda^1 V$), then $i_\alpha v = \alpha(v)$ is the dual pairing between elements of $V$ and elements of $V^*$.

3. For each $\alpha \in V^*$, $i_\alpha$ is a graded derivation of degree -1:

$i_\alpha(a \wedge b) = (i_\alpha a) \wedge b + (-1)^{\deg a}\, a \wedge (i_\alpha b).$

In fact, these three properties are sufficient to characterize the interior product as well as define it in the general
infinite-dimensional case.

Further properties of the interior product include:

• $i_\alpha \circ i_\alpha = 0.$

• $i_\alpha \circ i_\beta = -i_\beta \circ i_\alpha.$
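
On basis multivectors, properties 2 and 3 together give $i_\alpha(e_{i_1} \wedge \cdots \wedge e_{i_k}) = \sum_p (-1)^{p-1} \alpha(e_{i_p})\, e_{i_1} \wedge \cdots \widehat{e_{i_p}} \cdots \wedge e_{i_k}$, where the hat marks an omitted factor. The sketch below implements this expansion under stated assumptions: multivectors are dictionaries from increasing index tuples (1-based) to coefficients, a covector is a list of its components $\alpha(e_1), \ldots, \alpha(e_n)$, and the function name `interior` is hypothetical.

```python
def interior(alpha, mv):
    """Contract the multivector mv with the covector alpha.

    mv maps increasing 1-based index tuples to coefficients;
    alpha[i - 1] is the value of the covector on the basis vector e_i.
    """
    out = {}
    for idx, coeff in mv.items():
        for pos, i in enumerate(idx):
            rest = idx[:pos] + idx[pos + 1:]          # drop the factor e_i
            term = (-1) ** pos * alpha[i - 1] * coeff  # sign from moving e_i to the front
            out[rest] = out.get(rest, 0) + term
    return {k: c for k, c in out.items() if c != 0}


volume = {(1, 2, 3): 1}          # e1 ^ e2 ^ e3
alpha = [1, 2, 3]                # arbitrary sample covectors
beta = [0, 1, -1]

once = interior(alpha, volume)
twice = interior(alpha, once)
ab = interior(alpha, interior(beta, volume))
ba = interior(beta, interior(alpha, volume))

print(once, twice, ab, ba)
```

Contracting twice with the same covector gives the empty dictionary (the zero multivector), and swapping the order of two different contractions changes the overall sign.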

Hodge duality 

Suppose that $V$ has finite dimension n. Then the interior product induces a canonical isomorphism of vector spaces

$\Lambda^k(V^*) \otimes \Lambda^n(V) \to \Lambda^{n-k}(V).$

In the geometrical setting, a non-zero element of the top exterior power $\Lambda^n(V)$ (which is a one-dimensional vector
space) is sometimes called a volume form (or orientation form, although this term may sometimes lead to
ambiguity). Relative to a given volume form $\sigma$, the isomorphism is given explicitly by

$\alpha \in \Lambda^k(V^*) \mapsto i_\alpha \sigma \in \Lambda^{n-k}(V).$

If, in addition to a volume form, the vector space $V$ is equipped with an inner product identifying $V$ with $V^*$, then the
resulting isomorphism is called the Hodge dual (or more commonly the Hodge star operator)

$* : \Lambda^k(V) \to \Lambda^{n-k}(V).$

The composite of $*$ with itself maps $\Lambda^k(V) \to \Lambda^k(V)$ and is always a scalar multiple of the identity map. In most
applications, the volume form is compatible with the inner product in the sense that it is a wedge product of an
orthonormal basis of $V$. In this case,

$* \circ * : \Lambda^k(V) \to \Lambda^k(V) = (-1)^{k(n-k)+q} I$

where $I$ is the identity, and the inner product has metric signature (p,q): p plusses and q minuses.

Inner product 

For $V$ a finite-dimensional space, an inner product on $V$ defines an isomorphism of $V$ with $V^*$, and so also an
isomorphism of $\Lambda^k V$ with $(\Lambda^k V)^*$. The pairing between these two spaces also takes the form of an inner product. On
decomposable k-multivectors,

$\langle v_1 \wedge \cdots \wedge v_k, w_1 \wedge \cdots \wedge w_k \rangle = \det(\langle v_i, w_j \rangle),$

the determinant of the matrix of inner products. In the special case $v_i = w_i$, the inner product is the square norm of the
multivector, given by the determinant of the Gramian matrix $(\langle v_i, v_j \rangle)$. This is then extended bilinearly (or
sesquilinearly in the complex case) to a non-degenerate inner product on $\Lambda^k V$. If $e_i$, $i = 1, 2, \ldots, n$, form an orthonormal
basis of $V$, then the vectors of the form

$e_{i_1} \wedge \cdots \wedge e_{i_k}, \quad i_1 < \cdots < i_k,$

constitute an orthonormal basis for $\Lambda^k(V)$.






With respect to the inner product, exterior multiplication and the interior product are mutually adjoint. Specifically,
for $v \in \Lambda^{k-1}(V)$, $w \in \Lambda^k(V)$, and $x \in V$,

$\langle x \wedge v, w \rangle = \langle v, i_{x^\flat} w \rangle$

where $x^\flat \in V^*$ is the linear functional defined by

$x^\flat(y) = \langle x, y \rangle$

for all $y \in V$. This property completely characterizes the inner product on the exterior algebra.

Functoriality 

Suppose that $V$ and $W$ are a pair of vector spaces and $f : V \to W$ is a linear transformation. Then, by the universal
construction, there exists a unique homomorphism of graded algebras

$\Lambda(f) : \Lambda(V) \to \Lambda(W)$

such that

$\Lambda(f)|_{\Lambda^1(V)} = f : V = \Lambda^1(V) \to W = \Lambda^1(W).$

In particular, $\Lambda(f)$ preserves homogeneous degree. The k-graded components of $\Lambda(f)$ are given on decomposable
elements by

$\Lambda(f)(x_1 \wedge \cdots \wedge x_k) = f(x_1) \wedge \cdots \wedge f(x_k).$

Let

$\Lambda^k(f) = \Lambda(f)|_{\Lambda^k(V)} : \Lambda^k(V) \to \Lambda^k(W).$

The components of the transformation $\Lambda^k(f)$ relative to a basis of $V$ and $W$ form the matrix of $k \times k$ minors of $f$. In
particular, if $V = W$ and $V$ is of finite dimension n, then $\Lambda^n(f)$ is a mapping of a one-dimensional vector space $\Lambda^n(V)$ to
itself, and is therefore given by a scalar: the determinant of $f$.
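
The matrix of $k \times k$ minors can be computed directly for small examples. The following is a minimal sketch, assuming matrices stored as lists of rows; the function names `det`, `matmul` and `exterior_power` and the two sample matrices are hypothetical. Rows and columns of the result are indexed by increasing k-subsets in lexicographic order, and functoriality shows up as the identity $\Lambda^k(fg) = \Lambda^k(f)\, \Lambda^k(g)$.

```python
from itertools import combinations, permutations


def det(m):
    """Determinant via the permutation (Leibniz) sum; adequate for small matrices."""
    n = len(m)
    total = 0
    for perm in permutations(range(n)):
        sign = 1
        for i in range(n):
            for j in range(i + 1, n):
                if perm[i] > perm[j]:
                    sign = -sign
        prod = 1
        for row, col in enumerate(perm):
            prod *= m[row][col]
        total += sign * prod
    return total


def matmul(a, b):
    """Plain matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def exterior_power(m, k):
    """Matrix of k x k minors of m, indexed by increasing k-subsets."""
    rows = list(combinations(range(len(m)), k))
    cols = list(combinations(range(len(m[0])), k))
    return [[det([[m[r][c] for c in cs] for r in rs]) for cs in cols]
            for rs in rows]


f = [[1, 2, 0],
     [0, 1, 3],
     [4, 0, 1]]
g = [[2, 0, 1],
     [1, 1, 0],
     [0, 3, 1]]

top = exterior_power(f, 3)                        # 1 x 1 matrix holding det(f)
lhs = exterior_power(matmul(f, g), 2)             # second exterior power of f g
rhs = matmul(exterior_power(f, 2), exterior_power(g, 2))

print(top, lhs == rhs)
```

The permutation-sum determinant is chosen for brevity and exact integer arithmetic; it is not meant for large matrices.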

Exactness 

If

$$0 \to U \to V \to W \to 0$$

is a short exact sequence of vector spaces, then

$$0 \to \Lambda^1(U) \wedge \Lambda(V) \to \Lambda(V) \to \Lambda(W) \to 0$$

is an exact sequence of graded vector spaces,[10] as is

$$0 \to \Lambda(U) \to \Lambda(V).$$ [11]

Direct sums 

In particular, the exterior algebra of a direct sum is isomorphic to the tensor product of the exterior algebras:

$$\Lambda(V \oplus W) = \Lambda(V) \otimes \Lambda(W).$$

This is a graded isomorphism; i.e.,

$$\Lambda^k(V \oplus W) = \bigoplus_{p+q=k} \Lambda^p(V) \otimes \Lambda^q(W).$$
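
A quick consistency check on the graded isomorphism is to compare dimensions on both sides. The sketch below assumes
finite-dimensional spaces with dim V = n and dim W = m, and uses only the Python standard library; the function names
are illustrative. The left side has dimension C(n + m, k) and the right side has dimension equal to the sum of
C(n, p) C(m, q) over p + q = k, so agreement is the Vandermonde identity.

```python
from math import comb

def dim_exterior_power(n, k):
    """Dimension of Lambda^k of an n-dimensional space."""
    return comb(n, k)

def dim_graded_tensor_piece(n, m, k):
    """Dimension of the direct sum over p + q = k of Lambda^p(V) tensor Lambda^q(W)."""
    return sum(comb(n, p) * comb(m, k - p) for p in range(k + 1))

def dimensions_agree(n, m):
    """Check the degree-k dimension count for every k from 0 to n + m."""
    return all(dim_exterior_power(n + m, k) == dim_graded_tensor_piece(n, m, k)
               for k in range(n + m + 1))
```

Summing over all degrees recovers the ungraded count 2^(n+m) = 2^n 2^m for the full exterior algebras.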

Slightly more generally, if

$$0 \to U \to V \to W \to 0$$

is a short exact sequence of vector spaces then $\Lambda^k(V)$ has a filtration

$$0 = F^0 \subseteq F^1 \subseteq \cdots \subseteq F^k \subseteq F^{k+1} = \Lambda^k(V)$$

with quotients $F^{p+1}/F^p = \Lambda^{k-p}(U) \otimes \Lambda^p(W)$. In particular, if U is 1-dimensional then






$$0 \to U \otimes \Lambda^{k-1}(W) \to \Lambda^k(V) \to \Lambda^k(W) \to 0$$

is exact, and if W is 1-dimensional then

$$0 \to \Lambda^k(U) \to \Lambda^k(V) \to \Lambda^{k-1}(U) \otimes W \to 0$$

is exact.[12]



The alternating tensor algebra 

If K is a field of characteristic 0,[13] then the exterior algebra of a vector space V can be canonically identified with
the vector subspace of T(V) consisting of antisymmetric tensors. Recall that the exterior algebra is the quotient of
T(V) by the ideal I generated by $x \otimes x$.

Let $T^r(V)$ be the space of homogeneous tensors of degree r. This is spanned by decomposable tensors

$$v_1 \otimes \cdots \otimes v_r, \qquad v_i \in V.$$

The antisymmetrization (or sometimes the skew-symmetrization) of a decomposable tensor is defined by

$$\operatorname{Alt}(v_1 \otimes \cdots \otimes v_r) = \frac{1}{r!} \sum_{\sigma \in S_r} \operatorname{sgn}(\sigma)\, v_{\sigma(1)} \otimes \cdots \otimes v_{\sigma(r)}$$

where the sum is taken over the symmetric group of permutations on the symbols {1,...,r}. This extends by linearity
and homogeneity to an operation, also denoted by Alt, on the full tensor algebra T(V). The image Alt(T(V)) is the
alternating tensor algebra, denoted A(V). This is a vector subspace of T(V), and it inherits the structure of a graded
vector space from that on T(V). It carries an associative graded product $\widehat{\otimes}$ defined by

$$t \mathbin{\widehat{\otimes}} s = \operatorname{Alt}(t \otimes s).$$

Although this product differs from the tensor product, the kernel of Alt is precisely the ideal I (again, assuming that K
has characteristic 0), and there is a canonical isomorphism

$$A(V) \cong \Lambda(V).$$

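The antisymmetrization operator can be sketched concretely on tensors stored as multi-dimensional arrays. The code below
is a minimal illustration, assuming real coefficients, NumPy, and a tensor of rank r stored as an array with r axes; the
names perm_sign, alt and hat_product are hypothetical. It averages the signed axis permutations of a tensor and defines
the product on alternating tensors as Alt applied to the tensor product.

```python
import itertools
import math
import numpy as np

def perm_sign(perm):
    """Sign of a permutation given as a tuple of 0-based images, by counting inversions."""
    sign = 1
    for i in range(len(perm)):
        for j in range(i + 1, len(perm)):
            if perm[i] > perm[j]:
                sign = -sign
    return sign

def alt(t):
    """Antisymmetrization: signed average of all axis permutations of the array t."""
    t = np.asarray(t, dtype=float)
    total = np.zeros_like(t)
    for perm in itertools.permutations(range(t.ndim)):
        total += perm_sign(perm) * np.transpose(t, perm)
    return total / math.factorial(t.ndim)

def hat_product(t, s):
    """Product on alternating tensors: Alt of the tensor (outer) product."""
    return alt(np.multiply.outer(t, s))

x = np.array([1.0, 2.0, 3.0])
square_image = alt(np.multiply.outer(x, x))   # image of the generator x (x) x of the ideal I
```

In this sketch square_image vanishes, consistent with the generators of I lying in the kernel of Alt, and applying alt
twice gives the same result as applying it once, so Alt is a projection onto the alternating tensors.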
Index notation 

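Suppose that V has finite dimension n, and that a basis $e_1, \ldots, e_n$ of V is given. Then any alternating tensor
$t \in A^r(V) \subset T^r(V)$ can be written in index notation as

$$t = t^{i_1 i_2 \cdots i_r}\, e_{i_1} \otimes e_{i_2} \otimes \cdots \otimes e_{i_r}$$

where $t^{i_1 \cdots i_r}$ is completely antisymmetric in its indices.

The wedge product of two alternating tensors t and s of ranks r and p is given by

$$t \mathbin{\widehat{\otimes}} s = \frac{1}{(r+p)!} \sum_{\sigma \in S_{r+p}} \operatorname{sgn}(\sigma)\, t^{i_{\sigma(1)} \cdots i_{\sigma(r)}} s^{i_{\sigma(r+1)} \cdots i_{\sigma(r+p)}}\, e_{i_1} \otimes e_{i_2} \otimes \cdots \otimes e_{i_{r+p}}.$$

The components of this tensor are precisely the skew part of the components of the tensor product $t \otimes s$, denoted by
square brackets on the indices:

$$(t \mathbin{\widehat{\otimes}} s)^{i_1 \cdots i_{r+p}} = t^{[i_1 \cdots i_r} s^{i_{r+1} \cdots i_{r+p}]}.$$

The interior product may also be described in index notation as follows. Let $t = t^{i_0 i_1 \cdots i_{r-1}}$ be an antisymmetric
tensor of rank r. Then, for $\alpha \in V^*$, $i_\alpha t$ is an alternating tensor of rank r-1, given by

$$(i_\alpha t)^{i_1 \cdots i_{r-1}} = r \sum_{j=1}^{n} \alpha_j\, t^{j i_1 \cdots i_{r-1}},$$

where n is the dimension of V.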






Applications 

Linear geometry 

The decomposable vectors have geometric interpretations: the bivector $u \wedge v$ represents the plane spanned by the
vectors, "weighted" with a number, given by the area of the oriented parallelogram with sides u and v. Analogously,
the 3-vector $u \wedge v \wedge w$ represents the spanned 3-space weighted by the volume of the oriented parallelepiped with
edges u, v, and w.

Projective geometry 

Decomposable k-vectors in $\Lambda^k V$ correspond to weighted k-dimensional subspaces of V. In particular, the
Grassmannian of k-dimensional subspaces of V, denoted $\mathrm{Gr}_k(V)$, can be naturally identified with an algebraic
subvariety of the projective space $P(\Lambda^k V)$. This is called the Plücker embedding.

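The Plücker embedding can be made concrete in the smallest interesting case, 2-planes in a 4-dimensional space. The
following sketch assumes real coordinates and NumPy; the function names are illustrative, and the single quadratic
relation used is the classical Plücker relation for this case. It computes the coordinates of $u \wedge v$ as the 2 x 2
minors of the matrix with rows u and v, and evaluates the relation that cuts out the image of the Grassmannian.

```python
import itertools
import numpy as np

def plucker_coordinates(basis_vectors):
    """All k x k minors of the k x n matrix whose rows span the subspace."""
    m = np.asarray(basis_vectors, dtype=float)
    k, n = m.shape
    return {cols: float(np.linalg.det(m[:, list(cols)]))
            for cols in itertools.combinations(range(n), k)}

def plucker_relation_gr24(p):
    """p01*p23 - p02*p13 + p03*p12, which vanishes exactly on decomposable 2-vectors in R^4."""
    return p[(0, 1)] * p[(2, 3)] - p[(0, 2)] * p[(1, 3)] + p[(0, 3)] * p[(1, 2)]

plane = np.array([[1.0, 2.0, 0.0, 1.0],
                  [0.0, 1.0, 3.0, 2.0]])
p = plucker_coordinates(plane)

# A different basis of the same plane rescales every coordinate by the determinant
# of the change of basis, so the point in projective space is unchanged.
change = np.array([[2.0, 1.0],
                   [0.0, 3.0]])
q = plucker_coordinates(change @ plane)

# e1 ^ e2 + e3 ^ e4 is not decomposable: the relation fails for it.
mixed = {(0, 1): 1.0, (0, 2): 0.0, (0, 3): 0.0, (1, 2): 0.0, (1, 3): 0.0, (2, 3): 1.0}
```

Here plucker_relation_gr24(p) vanishes while plucker_relation_gr24(mixed) does not, which separates decomposable
2-vectors (weighted planes) from general elements of the second exterior power.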
Differential geometry 

The exterior algebra has notable applications in differential geometry, where it is used to define differential forms. A 
differential form at a point of a differentiable manifold is an alternating multilinear form on the tangent space at the 
point. Equivalently, a differential form of degree k is a linear functional on the k-th exterior power of the tangent 
space. As a consequence, the wedge product of multilinear forms defines a natural wedge product for differential 
forms. Differential forms play a major role in diverse areas of differential geometry. 

In particular, the exterior derivative gives the exterior algebra of differential forms on a manifold the structure of a 
differential algebra. The exterior derivative commutes with pullback along smooth mappings between manifolds, and 
it is therefore a natural differential operator. The exterior algebra of differential forms, equipped with the exterior 
derivative, is a differential complex whose cohomology is called the de Rham cohomology of the underlying 
manifold and plays a vital role in the algebraic topology of differentiable manifolds. 

Representation theory 

In representation theory, the exterior algebra is one of the two fundamental Schur functors on the category of vector 
spaces, the other being the symmetric algebra. Together, these constructions are used to generate the irreducible 
representations of the general linear group; see fundamental representation. 

Physics 

The exterior algebra is an archetypal example of a superalgebra, which plays a fundamental role in physical theories
pertaining to fermions and supersymmetry. For a physical discussion, see Grassmann number. For various other
applications of related ideas to physics, see superspace and supergroup (physics).

Lie algebra homology 

Let L be a Lie algebra over a field k. Then it is possible to define the structure of a chain complex on the exterior
algebra of L. The boundary operator is a k-linear mapping

$$\partial : \Lambda^{p+1} L \to \Lambda^p L$$

defined on decomposable elements by

$$\partial(x_1 \wedge \cdots \wedge x_{p+1}) = \frac{1}{p+1} \sum_{j<\ell} (-1)^{j+\ell+1}\, [x_j, x_\ell] \wedge x_1 \wedge \cdots \wedge \hat{x}_j \wedge \cdots \wedge \hat{x}_\ell \wedge \cdots \wedge x_{p+1},$$

where a hat indicates that the factor is omitted.

The Jacobi identity holds if and only if $\partial\partial = 0$, and so this is a necessary and sufficient condition for an
anticommutative nonassociative algebra L to be a Lie algebra. Moreover, in that case $\Lambda L$ is a chain complex with
boundary operator $\partial$. The homology associated to this complex is the Lie algebra homology.
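
The relation between the Jacobi identity and the vanishing of the squared boundary operator can be tested on small
examples. The sketch below is an illustration under stated assumptions: elements of the exterior algebra are stored as
dictionaries from increasing index tuples to rational coefficients, the Lie algebra is gl(2) with the matrix commutator,
and all function names are hypothetical. It implements the boundary formula above, including the factor 1/(p+1), and
also includes a made-up anticommutative bracket that violates the Jacobi identity, for comparison.

```python
from fractions import Fraction
from itertools import combinations
import numpy as np

def gl2_structure_constants():
    """c[i, j, m] with [e_i, e_j] = sum_m c[i, j, m] e_m for the basis E11, E12, E21, E22."""
    basis = []
    for a in range(2):
        for b in range(2):
            e = np.zeros((2, 2), dtype=int)
            e[a, b] = 1
            basis.append(e)
    c = np.zeros((4, 4, 4), dtype=int)
    for i, x in enumerate(basis):
        for j, y in enumerate(basis):
            c[i, j] = (x @ y - y @ x).reshape(-1)   # row-major coordinates match the basis order
    return c

def sort_with_sign(indices):
    """Return (sign, sorted tuple) for a wedge of basis vectors, or (0, ()) if an index repeats."""
    idx = list(indices)
    if len(set(idx)) < len(idx):
        return 0, ()
    sign = 1
    for a in range(len(idx)):
        for b in range(len(idx) - 1 - a):
            if idx[b] > idx[b + 1]:
                idx[b], idx[b + 1] = idx[b + 1], idx[b]
                sign = -sign
    return sign, tuple(idx)

def boundary(element, c):
    """Apply the boundary operator to a dict {increasing index tuple: coefficient}."""
    n = c.shape[0]
    out = {}
    for idx, coeff in element.items():
        k = len(idx)   # k = p + 1 in the formula
        for j, l in combinations(range(k), 2):
            rest = [idx[m] for m in range(k) if m not in (j, l)]
            sign = (-1) ** ((j + 1) + (l + 1) + 1)   # positions are 1-based in the formula
            for m in range(n):
                s = int(c[idx[j], idx[l], m])
                if s == 0:
                    continue
                perm_sign, key = sort_with_sign([m] + rest)
                if perm_sign == 0:
                    continue
                out[key] = out.get(key, Fraction(0)) + Fraction(sign * s * perm_sign, k) * coeff
    return {key: value for key, value in out.items() if value != 0}

# A made-up anticommutative bracket on a 3-dimensional space that violates Jacobi:
# [e1, e2] = e3 and [e1, e3] = e1, all other brackets of basis vectors zero.
non_lie = np.zeros((3, 3, 3), dtype=int)
non_lie[0, 1, 2], non_lie[1, 0, 2] = 1, -1
non_lie[0, 2, 0], non_lie[2, 0, 0] = 1, -1
```

Applying boundary twice to each basis 3-vector and to the basis 4-vector of gl(2) should give the empty dictionary, that
is, zero, whereas for the non_lie bracket the same computation on $e_1 \wedge e_2 \wedge e_3$ leaves a nonzero multiple of $e_3$.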






Homological algebra 

The exterior algebra is the main ingredient in the construction of the Koszul complex, a fundamental object in 
homological algebra. 

History 

The exterior algebra was first introduced by Hermann Grassmann in 1844 under the blanket term of
Ausdehnungslehre, or Theory of Extension.[14] This referred more generally to an algebraic (or axiomatic) theory of
extended quantities and was one of the early precursors to the modern notion of a vector space. Saint-Venant also
published similar ideas of exterior calculus for which he claimed priority over Grassmann.[15]

The algebra itself was built from a set of rules, or axioms, capturing the formal aspects of Cayley and Sylvester's
theory of multivectors. It was thus a calculus, much like the propositional calculus, except focused exclusively on
the task of formal reasoning in geometrical terms.[16] In particular, this new development allowed for an axiomatic
characterization of dimension, a property that had previously only been examined from the coordinate point of view.

The import of this new theory of vectors and multivectors was lost to mid-19th-century mathematicians,[17] until
being thoroughly vetted by Giuseppe Peano in 1888. Peano's work also remained somewhat obscure until the turn of
the century, when the subject was unified by members of the French geometry school (notably Henri Poincaré, Élie
Cartan, and Gaston Darboux) who applied Grassmann's ideas to the calculus of differential forms.

A short while later, Alfred North Whitehead, borrowing from the ideas of Peano and Grassmann, introduced his 
universal algebra. This then paved the way for the 20th century developments of abstract algebra by placing the 
axiomatic notion of an algebraic system on a firm logical footing. 

Notes 

[1] Provided the characteristic is different from 2.

[2] Grassmann (1844) introduced these as extended algebras (cf. Clifford 1878). He used the word äußere (literally translated as outer, or
exterior) only to indicate the Produkt he defined, which is nowadays conventionally called exterior product, probably to distinguish it from the
outer product as defined in modern linear algebra.

[3] This axiomatization of areas is due to Leopold Kronecker and Karl Weierstrass; see Bourbaki (1989, Historical Note). For a modern
treatment, see MacLane & Birkhoff (1999, Theorem IX.2.2). For an elementary treatment, see Strang (1993, Chapter 5).

[4] This definition is a standard one. See, for instance, MacLane & Birkhoff (1999).

[5] A proof of this can be found in more generality in Bourbaki (1989).

[6] See Sternberg (1964, §III.6).

[7] See Bourbaki (1989, III.7.1), and MacLane & Birkhoff (1999, Theorem XVI.6.8). More detail on universal properties in general can be found
in MacLane & Birkhoff (1999, Chapter VI), and throughout the works of Bourbaki.

[8] Some conventions, particularly in physics, define the wedge product as

$$\omega \wedge \eta = \operatorname{Alt}(\omega \otimes \eta).$$

This convention is not adopted here, but is discussed in connection with alternating tensors.

[9] Indeed, the exterior algebra of V is the enveloping algebra of the abelian Lie superalgebra structure on V.

[10] This part of the statement also holds in greater generality if V and W are modules over a commutative ring: that $\Lambda$ converts epimorphisms to
epimorphisms. See Bourbaki (1989, Proposition 3, III.7.2).

[11] This statement generalizes only to the case where V and W are projective modules over a commutative ring. Otherwise, it is generally not the
case that $\Lambda$ converts monomorphisms to monomorphisms. See Bourbaki (1989, Corollary to Proposition 12, III.7.9).

[12] Such a filtration also holds for vector bundles, and projective modules over a commutative ring. This is thus more general than the result
quoted above for direct sums, since not every short exact sequence splits in other abelian categories.

[13] See Bourbaki (1989, III.7.5) for generalizations.

[14] Kannenberg (2000) published a translation of Grassmann's work in English; he translated Ausdehnungslehre as Extension Theory.

[15] J. Itard, Biography in Dictionary of Scientific Biography (New York 1970-1990).

[16] Authors have in the past referred to this calculus variously as the calculus of extension (Whitehead 1898; Forder 1941), or extensive algebra
(Clifford 1878), and recently as extended vector algebra (Browne 2007).

[17] Bourbaki 1989, p. 661.






References 
Mathematical references 

• Bishop, R.; Goldberg, S.I. (1980), Tensor analysis on manifolds, Dover, ISBN 0-486-64039-6 

Includes a treatment of alternating tensors and alternating forms, as well as a detailed discussion of 
Hodge duality from the perspective adopted in this article. 

• Bourbaki, Nicolas (1989), Elements of mathematics, Algebra I, Springer-Verlag, ISBN 3-540-64243-9

This is the main mathematical reference for the article. It introduces the exterior algebra of a module
over a commutative ring (although this article specializes primarily to the case when the ring is a field),
including a discussion of the universal property, functoriality, duality, and the bialgebra structure. See
chapters III.7 and III.11.

• Bryant, R.L.; Chern, S.S.; Gardner, R.B.; Goldschmidt, H.L.; Griffiths, P.A. (1991), Exterior differential systems,
Springer-Verlag

This book contains applications of exterior algebras to problems in partial differential equations. Rank 
and related concepts are developed in the early chapters. 

• MacLane, S.; Birkhoff, G. (1999), Algebra, AMS Chelsea, ISBN 0-8218-1646-2 

Chapter XVI sections 6-10 give a more elementary account of the exterior algebra, including duality, 
determinants and minors, and alternating forms. 

• Sternberg, Shlomo (1964), Lectures on Differential Geometry, Prentice Hall 

Contains a classical treatment of the exterior algebra as alternating tensors, and applications to 
differential geometry. 

Historical references 

• Bourbaki, Nicolas (1989), "Historical note on chapters II and III", Elements of mathematics, Algebra I,
Springer-Verlag

• Clifford, W. (1878), "Applications of Grassmann's Extensive Algebra" (http://jstor.org/stable/2369379),
American Journal of Mathematics (The Johns Hopkins University Press) 1 (4): 350-358, doi:10.2307/2369379

• Forder, H. G. (1941), The Calculus of Extension, Cambridge University Press

• Grassmann, Hermann (1844), Die Lineale Ausdehnungslehre - Ein neuer Zweig der Mathematik
(http://books.google.com/books?id=bKgAAAAAMAAJ&pg=PA1&dq=Die+Lineale+Ausdehnungslehre+ein+neuer+Zweig+der+Mathematik)
(The Linear Extension Theory - A new Branch of Mathematics)

• Kannenberg, Lloyd (2000), Extension Theory (translation of Grassmann's Ausdehnungslehre), American
Mathematical Society, ISBN 0821820311

• Peano, Giuseppe (1888), Calcolo Geometrico secondo l'Ausdehnungslehre di H. Grassmann preceduto dalle
Operazioni della Logica Deduttiva; Kannenberg, Lloyd (1999), Geometric calculus: according to the
Ausdehnungslehre of H. Grassmann, Birkhäuser, ISBN 978-0817641269.

• Whitehead, Alfred North (1898), A Treatise on Universal Algebra, with Applications
(http://historical.library.cornell.edu/cgi-bin/cul.math/docviewer?did=01950001&seq=5), Cambridge






Other references and further reading 

• Browne, J.M. (2007), Grassmann algebra - Exploring applications of Extended Vector Algebra with
Mathematica, Published on line (http://www.grassmannalgebra.info/grassmannalgebra/book/index.htm)

An introduction to the exterior algebra, and geometric algebra, with a focus on applications. Also
includes a history section and bibliography.

• Spivak, Michael (1965), Calculus on manifolds, Addison-Wesley, ISBN 978-0805390216

Includes applications of the exterior algebra to differential forms, specifically focused on integration and
Stokes's theorem. The notation $\Lambda^k V$ in this text is used to mean the space of alternating k-forms on V;
i.e., for Spivak $\Lambda^k V$ is what this article would call $\Lambda^k V^*$. Spivak discusses this in Addendum 4.

• Strang, G. (1993), Introduction to linear algebra, Wellesley-Cambridge Press, ISBN 978-0961408855

Includes an elementary treatment of the axiomatization of determinants as signed areas, volumes, and
higher-dimensional volumes.

• Onishchik, A.L. (2001), "Exterior algebra" (http://eom.springer.de/E/e037080.htm), in Hazewinkel, Michiel,
Encyclopaedia of Mathematics, Springer, ISBN 978-1556080104

• Fleming, Wendell H. (1965), Functions of Several Variables, Addison-Wesley.

Chapter 6: Exterior algebra and differential calculus, pages 205-38. This textbook in multivariate
calculus introduces the exterior algebra of differential forms adroitly into the calculus sequence for
colleges.

• Winitzki, S. (2010), Linear Algebra via Exterior Products, Published on line
(http://sites.google.com/site/winitzki/linalg)

An introduction to the coordinate-free approach in basic finite-dimensional linear algebra, using exterior 
products. 






Supergroup 

The concept of supergroup is a generalization of that of group. In other words, every group is a supergroup but not
every supergroup is a group. A supergroup is like a Lie group in that there is a well-defined notion of smooth
function on it. However, the functions may have even and odd parts. Moreover, a supergroup has a super
Lie algebra, which plays a role similar to that of a Lie algebra for a Lie group: it determines most of the
representation theory and is the starting point for classification.

More formally, a Lie supergroup is a supermanifold G together with a multiplication morphism
$\mu : G \times G \to G$, an inversion morphism $i : G \to G$ and a unit morphism $e : 1 \to G$ which make G a
group object in the category of supermanifolds. This means that, formulated as commutative diagrams, the usual
associativity and inversion axioms of a group continue to hold. Since every manifold is a supermanifold, a Lie
supergroup generalises the notion of a Lie group.

There are many possible supergroups. The ones of most interest in theoretical physics are the ones which extend the
Poincaré group or the conformal group. Of particular interest are the orthosymplectic groups Osp(N/M) and the
superconformal groups SU(N/M).

An equivalent algebraic approach starts from the observation that a supermanifold is determined by its ring of
supercommutative smooth functions, and that a morphism of supermanifolds corresponds one-to-one with an algebra
homomorphism between their functions in the opposite direction, i.e., that the category of supermanifolds is opposite
to the category of algebras of smooth graded commutative functions. Reversing all the arrows in the commutative
diagrams that define a Lie supergroup then shows that functions over the supergroup have the structure of a
$\mathbb{Z}_2$-graded Hopf algebra. Likewise the representations of this Hopf algebra turn out to be $\mathbb{Z}_2$-graded comodules. This
Hopf algebra gives the global properties of the supergroup.

There is another related Hopf algebra which is the dual of the previous Hopf algebra. It can be identified with the
Hopf algebra of graded differential operators at the origin. It only gives the local properties of the symmetries, i.e., it
only gives information about infinitesimal supersymmetry transformations. The representations of this Hopf algebra
are modules. As in the non-graded case, this Hopf algebra can be described purely algebraically as the universal
enveloping algebra of the Lie superalgebra.

In a similar way one can define an affine algebraic supergroup as a group object in the category of super algebraic
affine varieties. An affine algebraic supergroup has a similar one-to-one relation to its Hopf algebra of
superpolynomials. Using the language of schemes, which combines the geometric and algebraic points of view, algebraic
supergroup schemes can be defined, including super Abelian varieties.






Superalgebra 

In mathematics and theoretical physics, a superalgebra is a $\mathbb{Z}_2$-graded algebra.[1] That is, it is an algebra over a
commutative ring or field with a decomposition into "even" and "odd" pieces and a multiplication operator that
respects the grading.

The prefix super- comes from the theory of supersymmetry in theoretical physics. Superalgebras and their
representations, supermodules, provide an algebraic framework for formulating supersymmetry. The study of such
objects is sometimes called super linear algebra. Superalgebras also play an important role in the related field of
supergeometry, where they enter into the definitions of graded manifolds, supermanifolds and superschemes.

Formal definition 

Let K be a fixed commutative ring. In most applications, K is a field such as $\mathbb{R}$ or $\mathbb{C}$.

A superalgebra over K is a K-module A with a direct sum decomposition

$$A = A_0 \oplus A_1$$

together with a bilinear multiplication $A \times A \to A$ such that

$$A_i A_j \subseteq A_{i+j}$$

where the subscripts are read modulo 2.

A superring, or $\mathbb{Z}_2$-graded ring, is a superalgebra over the ring of integers $\mathbb{Z}$.

The elements of $A_i$ are said to be homogeneous. The parity of a homogeneous element x, denoted by |x|, is 0 or 1
according to whether it is in $A_0$ or $A_1$. Elements of parity 0 are said to be even and those of parity 1 to be odd. If x
and y are both homogeneous then so is the product xy and |xy| = |x| + |y| (modulo 2).

An associative superalgebra is one whose multiplication is associative and a unital superalgebra is one with a 
multiplicative identity element. The identity element in a unital superalgebra is necessarily even. Unless otherwise 
specified, all superalgebras in this article are assumed to be associative and unital. 

A commutative superalgebra is one which satisfies a graded version of commutativity. Specifically, A is
commutative if

$$yx = (-1)^{|x||y|}\, xy$$

for all homogeneous elements x and y of A.
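
The graded commutativity rule can be seen in a small model of the exterior algebra, which is commutative in this graded
sense. The sketch below is illustrative only: it assumes integer coefficients, generators labelled 0, 1, 2, ..., and
elements stored as dictionaries from increasing index tuples to coefficients; the function names are hypothetical. Each
monomial has parity equal to its number of generators modulo 2.

```python
from itertools import product

def wedge_basis(a, b):
    """Product of two basis monomials (increasing index tuples): (sign, increasing tuple)."""
    if set(a) & set(b):
        return 0, ()          # a repeated generator squares to zero
    idx = list(a + b)
    sign = 1
    for i in range(len(idx)):
        for j in range(len(idx) - 1 - i):
            if idx[j] > idx[j + 1]:
                idx[j], idx[j + 1] = idx[j + 1], idx[j]
                sign = -sign  # each transposition of generators contributes -1
    return sign, tuple(idx)

def multiply(x, y):
    """Product of two elements stored as {monomial: coefficient}."""
    out = {}
    for (a, ca), (b, cb) in product(x.items(), y.items()):
        sign, key = wedge_basis(a, b)
        if sign:
            out[key] = out.get(key, 0) + sign * ca * cb
    return {k: v for k, v in out.items() if v != 0}

def parity(x):
    """Parity of a homogeneous element, or None if x mixes even and odd monomials."""
    parities = {len(k) % 2 for k in x}
    return parities.pop() if len(parities) == 1 else None

def scale(x, s):
    return {k: s * v for k, v in x.items()}

odd_x = {(0,): 1, (0, 1, 2): 2}      # theta_0 + 2 theta_0 theta_1 theta_2
odd_y = {(1,): 1, (2,): 1}           # theta_1 + theta_2
even_z = {(): 1, (0, 1): 1}          # 1 + theta_0 theta_1
```

With these sample elements, two odd elements anticommute, so multiply(odd_y, odd_x) equals scale(multiply(odd_x, odd_y), -1),
while the even element commutes with both.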

Examples 

• Any algebra over a commutative ring K may be regarded as a purely even superalgebra over K; that is, by taking
$A_1$ to be trivial.

• Any $\mathbb{Z}$- or $\mathbb{N}$-graded algebra may be regarded as a superalgebra by reading the grading modulo 2. This includes
examples such as tensor algebras and polynomial rings over K.

• In particular, any exterior algebra over K is a superalgebra. The exterior algebra is the standard example of a 
supercommutative algebra. 

• The symmetric polynomials and alternating polynomials together form a superalgebra, being the even and odd 
parts, respectively. Note that this is a different grading from the grading by degree. 

• Clifford algebras are superalgebras. They are generally noncommutative. 

• The set of all endomorphisms (both even and odd) of a super vector space forms a superalgebra under 
composition. 

• The set of all square supermatrices with entries in K forms a superalgebra denoted by $M_{p|q}(K)$. This algebra may
be identified with the algebra of endomorphisms of a free supermodule over K of rank p|q.






• Lie superalgebras are a graded analog of Lie algebras. Lie superalgebras are nonunital and nonassociative; 
however, one may construct the analog of a universal enveloping algebra of a Lie superalgebra which is a unital, 
associative superalgebra. 

Further definitions and constructions 

In the physics literature, a superalgebra (more precisely, a Lie superalgebra) is an algebra with a $\mathbb{Z}_2$ grading
("even" elements E and "odd" elements O) such that (i) the bracket of two generators is always antisymmetric except
for two odd elements, where it is symmetric, and (ii) the Jacobi identities are satisfied:

$$[[E_i, E_j], O_a] = [E_i, [E_j, O_a]] - [E_j, [E_i, O_a]]$$
$$[E_i, \{O_a, O_b\}] = \{[E_i, O_a], O_b\} + \{[E_i, O_b], O_a\}$$
$$[O_a, \{O_b, O_c\}] = [\{O_a, O_b\}, O_c] + [\{O_a, O_c\}, O_b]$$

The first of these three identities says that the O form a representation of the ordinary Lie algebra spanned by the E.
(Consider the O as vectors on which the E act.) The second is equivalent to the first if the Killing form is nonsingular.
The last identity restricts the possible representations O of the ordinary Lie algebra. This relation is the reason that
not every ordinary Lie algebra can be extended to a superalgebra.
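
Graded Jacobi identities of this kind can be checked numerically in a matrix example. The sketch below is a minimal
illustration under stated assumptions: it uses the 2 x 2 real supermatrices with diagonal matrices as even elements and
off-diagonal matrices as odd elements, takes the commutator whenever an even element is involved and the anticommutator
for two odd elements, and relies on NumPy. The names EVEN, ODD, comm, anti and jacobi_residuals are hypothetical.

```python
import itertools
import numpy as np

# Assumed basis of the 2 x 2 supermatrices: diagonal units are even, off-diagonal units are odd.
EVEN = [np.array([[1.0, 0.0], [0.0, 0.0]]), np.array([[0.0, 0.0], [0.0, 1.0]])]
ODD = [np.array([[0.0, 1.0], [0.0, 0.0]]), np.array([[0.0, 0.0], [1.0, 0.0]])]

def comm(x, y):
    """Commutator [x, y], used when at least one argument is even."""
    return x @ y - y @ x

def anti(x, y):
    """Anticommutator {x, y}, used for two odd arguments."""
    return x @ y + y @ x

def jacobi_residuals():
    """Left side minus right side of three graded Jacobi identities, over all basis choices."""
    res = []
    for ei, ej, oa in itertools.product(EVEN, EVEN, ODD):
        # [[E_i, E_j], O_a] = [E_i, [E_j, O_a]] - [E_j, [E_i, O_a]]
        res.append(comm(comm(ei, ej), oa) - comm(ei, comm(ej, oa)) + comm(ej, comm(ei, oa)))
    for ei, oa, ob in itertools.product(EVEN, ODD, ODD):
        # [E_i, {O_a, O_b}] = {[E_i, O_a], O_b} + {[E_i, O_b], O_a}
        res.append(comm(ei, anti(oa, ob)) - anti(comm(ei, oa), ob) - anti(comm(ei, ob), oa))
    for oa, ob, oc in itertools.product(ODD, ODD, ODD):
        # [O_a, {O_b, O_c}] = [{O_a, O_b}, O_c] + [{O_a, O_c}, O_b]
        res.append(comm(oa, anti(ob, oc)) - comm(anti(oa, ob), oc) - comm(anti(oa, oc), ob))
    return res
```

Every residual returned by jacobi_residuals() should be the zero matrix, since these identities hold for the graded
commutator of any associative superalgebra. The anticommutator of the two odd basis matrices is the identity matrix,
which is even, as the grading requires.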

Even subalgebra 

Let A be a superalgebra over a commutative ring K. The submodule $A_0$, consisting of all even elements, is closed
under multiplication and contains the identity of A and therefore forms a subalgebra of A, naturally called the even
subalgebra. It forms an ordinary algebra over K.

The set of all odd elements $A_1$ is an $A_0$-bimodule whose scalar multiplication is just multiplication in A. The product
in A equips $A_1$ with a bilinear form

$$\mu : A_1 \otimes_{A_0} A_1 \to A_0$$

such that

$$\mu(x \otimes y) \cdot z = x \cdot \mu(y \otimes z)$$

for all x, y, and z in $A_1$. This follows from the associativity of the product in A.

Grade involution 

There is a canonical involutive automorphism on any superalgebra called the grade involution. It is given on 
homogeneous elements by 

$$\hat{x} = (-1)^{|x|} x$$

and on arbitrary elements by

$$\hat{x} = x_0 - x_1$$

where $x_i$ are the homogeneous parts of x. If A has no 2-torsion (in particular, if 2 is invertible) then the grade
involution can be used to distinguish the even and odd parts of A:

$$A_i = \{x \in A : \hat{x} = (-1)^i x\}.$$






Supercommutativity 

The supercommutator on A is the binary operator given by

$$[x, y] = xy - (-1)^{|x||y|}\, yx$$

on homogeneous elements. This can be extended to all of A by linearity. Elements x and y of A are said to
supercommute if [x, y] = 0.

The supercenter of A is the set of all elements of A which supercommute with all elements of A:

$$Z(A) = \{a \in A : [a, x] = 0 \text{ for all } x \in A\}.$$

The supercenter of A is, in general, different from the center of A as an ungraded algebra. A commutative
superalgebra is one whose supercenter is all of A.

Super tensor product 

The graded tensor product of two superalgebras may be regarded as a superalgebra with a multiplication rule 
determined by: 

$$(a_1 \otimes b_1)(a_2 \otimes b_2) = (-1)^{|b_1||a_2|}\, (a_1 a_2 \otimes b_1 b_2).$$

Generalizations and categorical definition 

One can easily generalize the definition of superalgebras to include superalgebras over a commutative superring. The 
definition given above is then a specialization to the case where the base ring is purely even. 

Let R be a commutative superring. A superalgebra over R is an R-supermodule A with an R-bilinear multiplication
$A \times A \to A$ that respects the grading. Bilinearity here means that

$$r \cdot (xy) = (r \cdot x)y = (-1)^{|r||x|}\, x(r \cdot y)$$

for all homogeneous elements $r \in R$ and $x, y \in A$.

Equivalently, one may define a superalgebra over R as a superring A together with a superring homomorphism $R \to A$
whose image lies in the supercenter of A.

One may also define superalgebras categorically. The category of all R-supermodules forms a monoidal category
under the super tensor product with R serving as the unit object. An associative, unital superalgebra over R can then
be defined as a monoid in the category of R-supermodules. That is, a superalgebra is an R-supermodule A with two
(even) morphisms

$$\mu : A \otimes A \to A$$

$$\eta : R \to A$$

for which the usual diagrams commute. 






Notes 

[1] Kac, Martinez & Zelmanov (2001), p. 3 (http://books.google.com/books?id=jTCNZz2Tk4cC&pg=PA3&dq="superalgebra").
[2] P. van Nieuwenhuizen, Phys. Rep. 68, 189 (1981) 

References 

• Deligne, Pierre; John W. Morgan (1999). "Notes on Supersymmetry (following Joseph Bernstein)". Quantum
Fields and Strings: A Course for Mathematicians. 1. American Mathematical Society. pp. 41-97. ISBN
0-8218-2012-5.

• Manin, Y. I. (1997). Gauge Field Theory and Complex Geometry (2nd ed.). Berlin: Springer.
ISBN 3-540-61378-1.

• Varadarajan, V. S. (2004). Supersymmetry for Mathematicians: An Introduction. Courant Lecture Notes in 
Mathematics 11. American Mathematical Society. ISBN 0-8218-3574-2. 

• Kac, Victor G.; Martinez, Consuelo; Zelmanov, Efim (2001). Graded simple Jordan superalgebras of growth 
one. Memoirs of the AMS Series. 711. AMS Bookstore. ISBN 9780821826454. 

Supergravity 

In theoretical physics, supergravity (supergravity theory) is a field theory that combines the principles of
supersymmetry and general relativity. Together, these imply that, in supergravity, the supersymmetry is a local
symmetry (in contrast to non-gravitational supersymmetric theories, such as the Minimal Supersymmetric Standard
Model (MSSM)). Since the generators of supersymmetry (SUSY) combine with those of the Poincaré group to form a
super-Poincaré algebra, supergravity follows naturally from supersymmetry.

Gravitons 

Like any field theory of gravity, a supergravity theory contains a spin-2 field whose quantum is the graviton. 
Supersymmetry requires the graviton field to have a superpartner. This field has spin 3/2 and its quantum is the 
gravitino. The number of gravitino fields is equal to the number of supersymmetries. Supergravity theories are often 
said to be the only consistent theories of interacting massless spin 3/2 fields. 

History 

Four-dimensional SUGRA 

SUGRA, or SUper GRAvity, was initially proposed as a four-dimensional theory in 1976 by Daniel Z. Freedman,
Peter van Nieuwenhuizen and Sergio Ferrara at Stony Brook University, but was quickly generalized to many
different theories in various numbers of dimensions and with greater numbers (N) of supersymmetry charges. Supergravity
theories with N>1 are usually referred to as extended supergravity (SUEGRA). Some supergravity theories were
shown to be equivalent to certain higher-dimensional supergravity theories via dimensional reduction (e.g. N = 1,
11-dimensional supergravity is dimensionally reduced on $S^7$ to N = 8, d = 4 SUGRA). The resulting theories were
sometimes referred to as Kaluza-Klein theories, as Kaluza and Klein constructed, nearly a century ago, a
five-dimensional gravitational theory whose 4-dimensional massless modes, after dimensional reduction on a circle,
describe electromagnetism coupled to gravity.






mSUGRA 

mSUGRA means minimal SUper GRAvity. The construction of a realistic model of particle interactions within the N
= 1 supergravity framework, where supersymmetry (SUSY) is broken by a super Higgs mechanism, was carried out
by Ali Chamseddine, Richard Arnowitt and Pran Nath in 1982. In these classes of models, collectively now known as
minimal supergravity Grand Unification Theories (mSUGRA GUT), gravity mediates the breaking of SUSY through
the existence of a hidden sector. mSUGRA naturally generates the soft SUSY-breaking terms, which are a
consequence of the super Higgs effect. Radiative breaking of electroweak symmetry through Renormalization
Group Equations (RGEs) follows as an immediate consequence. mSUGRA is one of the most widely investigated
models of particle physics owing to its predictive power: it requires only four input parameters and a sign to determine
the low-energy phenomenology from the scale of Grand Unification.

11d: the maximal SUGRA

One of these supergravities, the 11-dimensional theory, generated considerable excitement as the first potential
candidate for the theory of everything. This excitement was built on four pillars, two of which have now been largely
discredited:

• Werner Nahm showed that 11 dimensions was the largest number of dimensions consistent with a single graviton,
and that a theory with more dimensions would also have particles with spins greater than 2. These problems are
avoided in 12 dimensions if two of these dimensions are timelike, as has been often emphasized by Itzhak Bars.

• In 1981, Ed Witten showed that 11 was the smallest number of dimensions that was big enough to contain the
gauge groups of the Standard Model, namely SU(3) for the strong interactions and SU(2) times U(1) for the
electroweak interactions. Today many techniques exist to embed the Standard Model gauge group in supergravity
in any number of dimensions. For example, in the mid and late 1980s one often used the obligatory gauge
symmetry in type I and heterotic string theories. In type II string theory they could also be obtained by
compactifying on certain Calabi-Yau manifolds. Today one may also use D-branes to engineer gauge symmetries.

• In 1978, Eugene Cremmer, Bernard Julia and Joel Scherk (CJS) of the École Normale Supérieure found the
classical action for an 11-dimensional supergravity theory. This remains today the only known classical
11-dimensional theory with local supersymmetry and no fields of spin higher than two. Other 11-dimensional
theories are known that are quantum-mechanically inequivalent to the CJS theory, but classically equivalent (that
is, they reduce to the CJS theory when one imposes the classical equations of motion). For example, in the mid
1980s Bernard de Wit and Hermann Nicolai found an alternate theory in D=11 Supergravity with Local SU(8)
Invariance. This theory, while not manifestly Lorentz-invariant, is in many ways superior to the CJS theory in
that, for example, it dimensionally reduces to the 4-dimensional theory without recourse to the classical equations
of motion.

• In 1980, Peter G. O. Freund and M. A. Rubin showed that compactification from 11 dimensions preserving all the 
SUSY generators could occur in two ways, leaving only 4 or 7 macroscopic dimensions (the other 7 or 4 being 
compact). Unfortunately, the noncompact dimensions have to form an anti-de Sitter space. Today it is understood
that there are many possible compactifications, but that the Freund-Rubin compactifications are invariant under 
all of the supersymmetry transformations that preserve the action. 

Thus, the first two results appeared to establish 11 dimensions uniquely, the third result appeared to specify the 
theory, and the last result explained why the observed universe appears to be four-dimensional. 

Many of the details of the theory were fleshed out by Peter van Nieuwenhuizen, Sergio Ferrara and Daniel Z. 
Freedman. 






The end of the SUGRA era 

The initial excitement over 11-dimensional supergravity soon waned, as various failings were discovered, and 
attempts to repair the model failed as well. Problems included: 

• The compact manifolds which were known at the time and which contained the standard model were not 
compatible with supersymmetry, and could not hold quarks or leptons. One suggestion was to replace the 
compact dimensions with the 7-sphere, with the symmetry group SO(8), or the squashed 7-sphere, with symmetry 
group SO(5) times SU(2). 

• Until recently, the physical neutrinos seen in the real world were believed to be massless, and appeared to be 
left-handed, a phenomenon referred to as the chirality of the Standard Model. It was very difficult to construct a 
chiral fermion from a compactification — the compactified manifold needed to have singularities, but physics 
near singularities did not begin to be understood until the advent of orbifold conformal field theories in the late 
1980s. 

• Supergravity models generically result in an unrealistically large cosmological constant in four dimensions, and 
that constant is difficult to remove, so fine-tuning is required. This is still a problem today. 

• Quantization of the theory led to quantum field theory gauge anomalies rendering the theory inconsistent. In the 
intervening years physicists have learned how to cancel these anomalies. 

Some of these difficulties could be avoided by moving to a 10-dimensional theory involving superstrings. However, 
by moving to 10 dimensions one loses the sense of uniqueness of the 11-dimensional theory. 

The core breakthrough for the 10-dimensional theory, known as the first superstring revolution, was a demonstration 
by Michael B. Green, John H. Schwarz and David Gross that there are only three supergravity models in 10 
dimensions which have gauge symmetries and in which all of the gauge and gravitational anomalies cancel. These 
were theories built on the groups SO(32) and $E_8 \times E_8$, the direct product of two copies of $E_8$. Today we know 
that, using D-branes for example, gauge symmetries can be introduced in other 10-dimensional theories as well. 

The second superstring revolution 

Initial excitement about the 10d theories, and the string theories that provide their quantum completion, died by the 
end of the 1980s. There were too many Calabi-Yaus to compactify on, many more than Yau had estimated, as he 
admitted in December 2005 at the 23rd International Solvay Conference in Physics. None quite gave the standard 
model, but it seemed as though one could get close with enough effort in many distinct ways. Moreover, no one 
understood the theory beyond the regime of applicability of string perturbation theory. 

There was a comparatively quiet period at the beginning of the 1990s, during which, however, several important 
tools were developed. For example, it became apparent that the various superstring theories were related by "string 
dualities", some of which relate weak string-coupling (i.e. perturbative) physics in one model with strong 
string-coupling (i.e. non-perturbative) in another. 

Then it all changed, in what is known as the second superstring revolution. Joseph Polchinski realized that obscure 
string theory objects, called D-branes, which he had discovered six years earlier, are stringy versions of the p-branes 
that were known in supergravity theories. The treatment of these p-branes was not restricted by string perturbation 
theory; in fact, thanks to supersymmetry, p-branes in supergravity were understood well beyond the limits in which 
string theory was understood. 

Armed with this new nonperturbative tool, Edward Witten and many others were able to show that all of the 
perturbative string theories were descriptions of different states in a single theory, which Witten named M-theory. 
Furthermore, he argued that the long wavelength limit of M-theory (that is, the regime in which the quantum 
wavelengths associated with objects in the theory are much larger than the size of the 11th dimension) should be 
described by the 11-dimensional supergravity that had fallen out of favor with the first superstring revolution 
10 years earlier, accompanied by the 2- and 5-branes. 






Historically, then, supergravity has come "full circle". It is a commonly used framework in understanding features of 
string theories, M-theory and their compactifications to lower spacetime dimensions. 

Relation to superstrings 

Particular 10-dimensional supergravity theories are considered "low energy limits" of the 10-dimensional superstring 
theories; more precisely, these arise as the massless, tree-level approximation of string theories. True effective field 
theories of string theories, rather than truncations, are rarely available. Due to string dualities, the conjectured 
11-dimensional M-theory is required to have 11-dimensional supergravity as a "low energy limit". However, this 
doesn't necessarily mean that string theory/M-theory is the only possible UV completion of supergravity; 
supergravity research is useful independent of those relations. 

4D N = 1 SUGRA 

Before we move on to SUGRA proper, let's recapitulate some important details about general relativity. We have a 
4D differentiable manifold M with a Spin(3,1) principal bundle over it. This principal bundle represents the local 
Lorentz symmetry. In addition, we have a vector bundle T over the manifold with the fiber having four real 
dimensions and transforming as a vector under Spin(3,1). We have an invertible linear map from the tangent bundle 
TM to T. This map is the vierbein. The local Lorentz symmetry has a gauge connection associated with it, the spin 
connection. 

The following discussion will be in superspace notation, as opposed to the component notation, which isn't 
manifestly covariant under SUSY. There are actually many different versions of SUGRA out there which are 
inequivalent in the sense that their actions and constraints upon the torsion tensor are different, but ultimately 
equivalent in that we can always perform a field redefinition of the supervierbeins and spin connection to get from 
one version to another. 

In 4D N=1 SUGRA, we have a 4|4 real differentiable supermanifold M, i.e. we have 4 real bosonic dimensions and 4 
real fermionic dimensions. As in the nonsupersymmetric case, we have a Spin(3,1) principal bundle over M. We 
have an $\mathbb{R}^{4|4}$ vector bundle T over M. The fiber of T transforms under the local Lorentz group as follows: the four 
real bosonic dimensions transform as a vector and the four real fermionic dimensions transform as a Majorana 
spinor. This Majorana spinor can be reexpressed as a complex left-handed Weyl spinor and its complex conjugate 
right-handed Weyl spinor (they're not independent of each other). We also have a spin connection as before. 

We will use the following conventions: the spatial (both bosonic and fermionic) indices will be indicated by M, N, ... . 
The bosonic spatial indices will be indicated by $\mu, \nu, \ldots$, the left-handed Weyl spatial indices by $\alpha, \beta, \ldots$, and the 
right-handed Weyl spatial indices by $\dot{\alpha}, \dot{\beta}, \ldots$. The indices for the fiber of T will follow a similar notation, except 
that they will be hatted like this: $\hat{M}, \hat{\alpha}$. See van der Waerden notation for more details. $M = (\mu, \alpha, \dot{\alpha})$. The 
supervierbein is denoted by $e^{\hat{N}}_M$, and the spin connection by $\omega_{\hat{M}\hat{N}P}$. The inverse supervierbein is denoted by 
$E^N_{\hat{M}}$. 

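The supervierbein and spin connection are real in the sense that they satisfy the reality conditions 

$$e^{\hat{N}}_M(x, \overline{\theta}, \theta)^* = e^{\hat{N}^*}_{M^*}(x, \theta, \overline{\theta})$$

where $\mu^* = \mu$, $\alpha^* = \dot{\alpha}$, and $\dot{\alpha}^* = \alpha$, and

$$\omega(x, \overline{\theta}, \theta)^* = \omega(x, \theta, \overline{\theta}).$$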

The covariant derivative is defined as 

$$D_{\hat{M}} f = E^N_{\hat{M}} \left( \partial_N f + \omega_N[f] \right).$$

The covariant exterior derivative as defined over supermanifolds needs to be super graded. This means that every 
time we interchange two fermionic indices, we pick up a +1 sign factor, instead of -1. 

The presence or absence of R symmetries is optional, but if R-symmetry exists, the integrand over the full 
superspace has to have an R-charge of 0 and the integrand over chiral superspace has to have an R-charge of 2. 






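A chiral superfield X is a superfield which satisfies $\overline{D}_{\hat{\dot{\alpha}}} X = 0$. In order for this constraint to be consistent, we 
require the integrability conditions that $\{ \overline{D}_{\hat{\dot{\alpha}}}, \overline{D}_{\hat{\dot{\beta}}} \} = c^{\hat{\dot{\gamma}}}_{\hat{\dot{\alpha}}\hat{\dot{\beta}}} \overline{D}_{\hat{\dot{\gamma}}}$ for some coefficients c. 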

Unlike nonSUSY GR, the torsion has to be nonzero, at least with respect to the fermionic directions. Already, even 
in flat superspace, $D_{\hat{\alpha}} e_{\hat{\dot{\alpha}}} + \overline{D}_{\hat{\dot{\alpha}}} e_{\hat{\alpha}} \neq 0$. In one version of SUGRA (but certainly not the only one), we have the 
following constraints upon the torsion tensor: 

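$$T^{\hat{\underline{\gamma}}}_{\hat{\underline{\alpha}}\hat{\underline{\beta}}} = 0$$

$$T^{\hat{\mu}}_{\hat{\alpha}\hat{\beta}} = 0$$

$$T^{\hat{\mu}}_{\hat{\dot{\alpha}}\hat{\dot{\beta}}} = 0$$

$$T^{\hat{\nu}}_{\hat{\mu}\hat{\underline{\alpha}}} = 0$$

$$T^{\hat{\rho}}_{\hat{\mu}\hat{\nu}} = 0$$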

Here, $\underline{\alpha}$ is a shorthand notation to mean the index runs over either the left or right Weyl spinors. 

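The superdeterminant of the supervierbein, |e|, gives us the volume factor for M. Equivalently, we have the 
volume 4|4-superform $e^{\hat{\mu}=0} \wedge \cdots \wedge e^{\hat{\mu}=3} \wedge e^{\hat{\alpha}=1} \wedge e^{\hat{\alpha}=2} \wedge e^{\hat{\dot{\alpha}}=1} \wedge e^{\hat{\dot{\alpha}}=2}$. 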

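If we complexify the superdiffeomorphisms, there is a gauge where $E^{\mu}_{\hat{\dot{\alpha}}} = 0$, $E^{\beta}_{\hat{\dot{\alpha}}} = 0$ and $E^{\dot{\beta}}_{\hat{\dot{\alpha}}} = \delta^{\dot{\beta}}_{\dot{\alpha}}$. The 
resulting chiral superspace has the coordinates x and $\Theta$. 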

R is a scalar valued chiral superfield derivable from the supervierbeins and spin connection. If f is any superfield, 
$(\overline{D}^2 - 8R) f$ is always a chiral superfield. 

The action for a SUGRA theory with chiral superfields X is given by 

$$S = \int d^4x \, d^2\Theta \, 2\mathcal{E} \left[ \frac{3}{8} \left( \overline{D}^2 - 8R \right) e^{-K(\overline{X}, X)/3} + W(X) \right] + \text{c.c.}$$

where K is the Kähler potential and W is the superpotential, and $\mathcal{E}$ is the chiral volume factor. Unlike the case for 
flat superspace, adding a constant to either the Kähler or superpotential is now physical. A constant shift to the 
Kähler potential changes the effective Planck constant, while a constant shift to the superpotential changes the 
effective cosmological constant. As the effective Planck constant now depends upon the value of the chiral 
superfield X, we need to rescale the supervierbeins (a field redefinition) to get a constant Planck constant. This is 
called the Einstein frame. 






Higher-dimensional SUGRA 

See the article higher-dimensional supergravity for more details. 

References 
Historical 

[1] P. van Nieuwenhuizen, Phys. Rep. 68, 189 (1981) 
[2] http://ccdb4fs.kek.jp/cgi-bin/img_index78604009 

• D.Z. Freedman, P. van Nieuwenhuizen and S. Ferrara, "Progress Toward A Theory Of Supergravity", Physical 
Review D13 (1976) pp 3214-3218. 

• E. Cremmer, B. Julia and J. Scherk, "Supergravity theory in eleven dimensions", Physics Letters B76 (1978) pp 
409-412. scanned version (http://www-lib.kek.jp/cgi-bin/img_index77805106) 

• P. Freund and M. Rubin, "Dynamics of dimensional reduction", Physics Letters B97 (1980) pp 233-235. 

• Ali H. Chamseddine, R. Arnowitt, Pran Nath, "Locally Supersymmetric Grand Unification", Physical Review 
Letters 49 (1982) p 970. 

• Michael B. Green, John H. Schwarz, "Anomaly Cancellation in Supersymmetric D=10 Gauge Theory and 
Superstring Theory", Physics Letters B149 (1984) pp 117-122. 

General 

• Bernard de Wit (2002) Supergravity (http://arxiv.org/abs/hep-th/0212245v1) 

• A Supersymmetry primer (http://arxiv.org/abs/hep-ph/9709356) (1998) updated in (2006), (the user friendly 
guide). 

• Adel Bilal, Introduction to supersymmetry (http://arxiv.org/hep-th/0101055) (2001) ArXiv hep-th/0101055, (a 
comprehensive introduction to supersymmetry). 

• Friedemann Brandt, Lectures on supergravity (http://arxiv.org/abs/hep-th/0204035) (2002) ArXiv 
hep-th/0204035, (an introduction to 4 -dimensional N = 1 supergravity). 

• Wess, Julius; Bagger, Jonathan (1992). Supersymmetry and Supergravity. Princeton University Press, pp. 260. 
ISBN 0691025304. 






Quantum statistical mechanics 

Quantum statistical mechanics is the study of statistical ensembles of quantum mechanical systems. A statistical 
ensemble is described by a density operator S, which is a non-negative, self-adjoint, trace-class operator of trace 1 on 
the Hilbert space H describing the quantum system. This can be shown under various mathematical formalisms for 
quantum mechanics. One such formalism is provided by quantum logic. 

Expectation 

From classical probability theory, we know that the expectation of a random variable X is completely determined by 
its distribution $\operatorname{D}_X$ by 

$$\operatorname{Exp}(X) = \int_{\mathbb{R}} \lambda \, d\operatorname{D}_X(\lambda)$$

assuming, of course, that the random variable is integrable or that the random variable is non-negative. Similarly, let 
A be an observable of a quantum mechanical system. A is given by a densely defined self-adjoint operator on H. The 
spectral measure of A defined by 

$$\operatorname{E}_A(U) = \int_U \lambda \, d\operatorname{E}(\lambda),$$

uniquely determines A and conversely, is uniquely determined by A. $\operatorname{E}_A$ is a boolean homomorphism from the Borel 
subsets of R into the lattice Q of self-adjoint projections of H. In analogy with probability theory, given a state S, we 
introduce the distribution of A under S which is the probability measure defined on the Borel subsets of R by 

$$\operatorname{D}_A(U) = \operatorname{Tr}(\operatorname{E}_A(U) S).$$

Similarly, the expected value of A is defined in terms of the probability distribution $\operatorname{D}_A$ by 

$$\operatorname{Exp}(A) = \int_{\mathbb{R}} \lambda \, d\operatorname{D}_A(\lambda).$$

Note that this expectation is relative to the mixed state S which is used in the definition of $\operatorname{D}_A$. 

Remark. For technical reasons, one needs to consider separately the positive and negative parts of A defined by the 
Borel functional calculus for unbounded operators. 

One can easily show: 

$$\operatorname{Exp}(A) = \operatorname{Tr}(A S) = \operatorname{Tr}(S A).$$

Note that if S is a pure state corresponding to the vector $\psi$, 

$$\operatorname{Exp}(A) = \langle \psi | A | \psi \rangle.$$
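
As a finite-dimensional illustration of the trace formula for the expectation, the following minimal sketch checks that Tr(AS) = Tr(SA) for a mixed state and that Tr(AS) reduces to the bracket of A with the state vector for a pure state. The 2x2 observable, the two states and the helper names (matmul, trace, expectation) are assumptions chosen for the example, not part of the text.

```python
# Minimal sketch with plain Python lists; the 2x2 observable and the states
# are illustrative assumptions, not taken from the text.

def matmul(X, Y):
    """Product of two square matrices given as nested lists."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(X):
    """Sum of the diagonal entries."""
    return sum(X[i][i] for i in range(len(X)))

def expectation(A, S):
    """Exp(A) = Tr(A S) for an observable A and a density matrix S."""
    return trace(matmul(A, S))

A = [[1.0, 0.0], [0.0, -1.0]]            # self-adjoint observable
S_mixed = [[0.75, 0.0], [0.0, 0.25]]     # non-negative, trace 1

# Pure state S = |psi><psi| for the real unit vector psi = (1, 1)/sqrt(2).
psi = [2 ** -0.5, 2 ** -0.5]
S_pure = [[psi[i] * psi[j] for j in range(2)] for i in range(2)]

exp_mixed = expectation(A, S_mixed)      # Tr(AS)
exp_swapped = trace(matmul(S_mixed, A))  # Tr(SA), should agree with Tr(AS)
exp_pure = expectation(A, S_pure)
# <psi|A|psi> computed directly from the vector
braket = sum(psi[i] * A[i][j] * psi[j] for i in range(2) for j in range(2))
```

In infinite dimensions the same identities require the trace-class and domain conditions discussed in the text; the sketch only covers the matrix case.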
Von Neumann entropy 

Of particular significance for describing randomness of a state is the von Neumann entropy of S formally defined by 

$$\operatorname{H}(S) = -\operatorname{Tr}(S \log_2 S).$$

Actually, the operator $S \log_2 S$ is not necessarily trace-class. However, if S is a non-negative self-adjoint operator not 
of trace class we define Tr(S) = +∞. Also note that any density operator S can be diagonalized, that is, it can be 
represented in some orthonormal basis by a (possibly infinite) matrix of the form 

$$\begin{bmatrix} \lambda_1 & 0 & \cdots & 0 & \cdots \\ 0 & \lambda_2 & \cdots & 0 & \cdots \\ \vdots & \vdots & \ddots & & \\ 0 & 0 & & \lambda_n & \\ \vdots & \vdots & & & \ddots \end{bmatrix}$$




and we define 

$$\operatorname{H}(S) = -\sum_i \lambda_i \log_2 \lambda_i.$$

The convention is that $0 \log_2 0 = 0$, since an event with probability zero should not contribute to the entropy. This 
value is an extended real number (that is, in [0, ∞]) and this is clearly a unitary invariant of S. 

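Remark. It is indeed possible that H(S) = +∞ for some density operator S. In fact, let T be the diagonal matrix 

$$T = \begin{bmatrix} \frac{1}{2 (\log_2 2)^2} & 0 & \cdots & 0 & \cdots \\ 0 & \frac{1}{3 (\log_2 3)^2} & \cdots & 0 & \cdots \\ \vdots & \vdots & \ddots & & \\ 0 & 0 & & \frac{1}{n (\log_2 n)^2} & \\ \vdots & \vdots & & & \ddots \end{bmatrix}$$

T is non-negative trace class and one can show $T \log_2 T$ is not trace-class. 

Theorem. Entropy is a unitary invariant. 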

In analogy with classical entropy (notice the similarity in the definitions), H(S) measures the amount of randomness 
in the state S. The more dispersed the eigenvalues are, the larger the system entropy. For a system in which the space 
H is finite-dimensional, entropy is maximized for the states S which in diagonal form have the representation 

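$$\begin{bmatrix} \frac{1}{n} & 0 & \cdots & 0 \\ 0 & \frac{1}{n} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \frac{1}{n} \end{bmatrix}$$

For such an S, $\operatorname{H}(S) = \log_2 n$. The state S is called the maximally mixed state. 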
Recall that a pure state is one of the form 

$$S = |\psi\rangle \langle\psi|,$$

for $\psi$ a vector of norm 1. 

Theorem. H(S) = 0 if and only if S is a pure state. 

For S is a pure state if and only if its diagonal form has exactly one non-zero entry which is a 1. 

Entropy can be used as a measure of quantum entanglement. 
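
The two extreme cases above (maximally mixed and pure) can be checked numerically from the eigenvalues alone, since the entropy is a unitary invariant. The following is a minimal sketch; the function name von_neumann_entropy and the sample spectra are assumptions made for illustration.

```python
import math

def von_neumann_entropy(eigenvalues):
    """H(S) = -sum_i lambda_i log2(lambda_i), using the convention 0 log2 0 = 0."""
    return -sum(lam * math.log2(lam) for lam in eigenvalues if lam > 0.0)

n = 4
maximally_mixed = [1.0 / n] * n          # diagonal entries all equal to 1/n
pure = [1.0, 0.0, 0.0, 0.0]              # exactly one non-zero entry, equal to 1
skewed = [0.5, 0.25, 0.125, 0.125]       # an intermediate (hypothetical) spectrum

h_mixed = von_neumann_entropy(maximally_mixed)   # expected to equal log2(n)
h_pure = von_neumann_entropy(pure)               # expected to vanish
h_skewed = von_neumann_entropy(skewed)           # lies strictly between the two
```

The zero eigenvalues are skipped explicitly, which implements the stated convention rather than evaluating the logarithm at zero.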



Gibbs canonical ensemble 

Consider an ensemble of systems described by a Hamiltonian H with average energy E. If H has pure-point spectrum 
and the eigenvalues $E_n$ of H go to +∞ sufficiently fast, $e^{-rH}$ will be a non-negative trace-class operator for every 
positive r. 

The Gibbs canonical ensemble is described by the state 

$$S = \frac{e^{-\beta H}}{\operatorname{Tr}(e^{-\beta H})},$$

where $\beta$ is such that the ensemble average of energy satisfies 

$$\operatorname{Tr}(S H) = E,$$

and 

$$\operatorname{Tr}(e^{-\beta H}) = \sum_n e^{-\beta E_n}$$

is the quantum mechanical version of the canonical partition function. The probability that a system chosen at 
random from the ensemble will be in a state corresponding to energy eigenvalue $E_m$ is 

$$\mathcal{P}(E_m) = \frac{e^{-\beta E_m}}{\sum_n e^{-\beta E_n}}.$$





Under certain conditions, the Gibbs canonical ensemble maximizes the von Neumann entropy of the state subject to 
the energy conservation requirement. 
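
For a finite spectrum, the construction of the Gibbs state can be sketched directly in the energy eigenbasis, where the state is diagonal with the standard Boltzmann weights exp(-beta E_n) divided by the partition function. The sketch below assumes a hypothetical three-level spectrum and finds beta by bisection so that the ensemble average Tr(SH) matches a prescribed energy; the function names and the bisection bounds are illustrative choices, not part of the text.

```python
import math

def gibbs_probabilities(energies, beta):
    """Boltzmann weights exp(-beta E_n) / Z for a pure-point spectrum."""
    weights = [math.exp(-beta * e) for e in energies]
    z = sum(weights)                      # partition function Tr(exp(-beta H))
    return [w / z for w in weights], z

def mean_energy(energies, beta):
    """Ensemble average Tr(S H), evaluated in the energy eigenbasis."""
    probs, _ = gibbs_probabilities(energies, beta)
    return sum(p * e for p, e in zip(probs, energies))

def solve_beta(energies, target, lo=1e-6, hi=50.0, steps=200):
    """Bisection for beta with Tr(S H) = target; the mean energy decreases in beta."""
    for _ in range(steps):
        mid = 0.5 * (lo + hi)
        if mean_energy(energies, mid) > target:
            lo = mid                      # too hot: increase beta
        else:
            hi = mid                      # too cold: decrease beta
    return 0.5 * (lo + hi)

energies = [0.0, 1.0, 2.0]               # hypothetical three-level spectrum
beta = solve_beta(energies, target=0.5)
probs, z = gibbs_probabilities(energies, beta)
```

Bisection is used only because the mean energy is monotone in beta; any one-dimensional root finder would serve, and the target must lie between the lowest energy and the infinite-temperature average for a positive beta to exist.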

References 

• J. von Neumann, Mathematical Foundations of Quantum Mechanics, Princeton University Press, 1955. 

• F. Reif, Statistical and Thermal Physics, McGraw-Hill, 1965. 

Quantum thermodynamics 

In the physical sciences, quantum thermodynamics is the study of heat and work dynamics in quantum systems. 
Approximately, quantum thermodynamics attempts to combine thermodynamics and quantum mechanics into a 
coherent whole. The essential point at which "quantum mechanics" began was when, in 1900, Max Planck outlined 
the "quantum hypothesis", i.e. that the energy of atomic systems can be quantized, as based on the first two laws of 
thermodynamics as described by Rudolf Clausius (1865) and Ludwig Boltzmann (1877).[1][2] See the history of 
quantum mechanics for a more detailed outline. 

Overview 

A central objective in quantum thermodynamics is the quantitative and qualitative determination of the laws of 
thermodynamics at the quantum level in which uncertainty and probability begin to take effect. A fundamental 
question is: what remains of thermodynamics if one goes to the extreme limit of small quantum systems having a 
few degrees of freedom? If thermodynamics applies at this level, does the second law of thermodynamics remain 
unchanged, or is there a more universal formulation than the many existing formulations, such as: the entropy of a 
closed system cannot decrease; heat flows from high to low temperature; systems evolve towards minimum potential 
energy wells; energy tends to dissipate; and so on? 

References 

[1] Planck, Max. (1900). "Entropy and Temperature of Radiant Heat (http://www.iee.org/publish/inspec/prodcat/1900A01446.xml)". 
Annalen der Physik, vol. 1, no. 4, April, pp. 719-37. 
[2] Planck, Max. (1901). "On the Law of Distribution of Energy in the Normal Spectrum (http://dbhs.wvusd.k12.ca.us/webdocs/Chem-History/Planck-1901/Planck-1901.html)". 
Annalen der Physik, vol. 4, p. 553 ff. 

Further reading 

1. Gemmer, J., Michel, M., Mahler, G. (2005). Quantum Thermodynamics — Emergence of Thermodynamic 
Behavior Within Composite Quantum Systems. Springer. ISBN 3-540-22911-6. 

2. Rudakov, E.S. (1998). Molecular, Quantum and Evolution Thermodynamics: Development and Specialization of 
the Gibbs Method. Donetsk State University Press. ISBN 966-02-0708-5. 






External links 

• Quantum Thermodynamics and the Gibbs Paradox (http://staff.science.uva.nl/~nieuwenh/QL2L.html) 

• Quantum Thermodynamics (http://www.chaos.org.uk/~eddy/physics/heat.html) 

• On the Classical Limit of Quantum Thermodynamics in Finite Time (http://www.fh.huji.ac.il/~ronnie/Papers/geva92.pdf) 
[PDF-format] 

• Quantum Thermodynamics (http://www.quantumthermodynamics.org) - list of good related articles 

Supertheory 

The theory of everything (TOE) is a putative theory of theoretical physics that fully explains and links together all 
known physical phenomena, and, ideally, has predictive power for the outcome of any experiment that could be 
carried out in principle. 

Initially, the term was used with an ironic connotation to refer to various overgeneralized theories. For example, a 
great-grandfather of Ijon Tichy (a character from a cycle of Stanisław Lem's science fiction stories of the 
1960s) was known to work on the "General Theory of Everything". Physicist John Ellis claims to have introduced 
the term into the technical literature in an article in Nature in 1986. Over time, the term stuck in popularizations of 
quantum physics to describe a theory that would unify or explain through a single model the theories of all 
fundamental interactions and of all particles of nature: general relativity for gravitation, and the standard model of 
elementary particle physics - which includes quantum mechanics - for electromagnetism, the two nuclear 
interactions, and the known elementary particles. 

There have been many theories of everything proposed by theoretical physicists over the twentieth century, but none 
have been confirmed experimentally. The primary problem in producing a TOE is that the accepted theories of 
quantum mechanics and general relativity are hard to combine. Their mutual incompatibility makes their unification 
a difficult task. The combination is one of the unsolved problems in physics. 

Based on theoretical holographic principle arguments from the 1990s, many physicists believe that 11-dimensional 
M-theory, which is described in many sectors by matrix string theory, in many other sectors by perturbative string 
theory, is the complete theory of everything. However, there is no widespread consensus on this issue, because 
M-theory is not a completed theory but rather an approach for producing one. 

Historical antecedents 
Ancient Greece to Einstein 

Archimedes was possibly the first scientist to describe nature with axioms (or principles) and then to deduce new 
results from them. The putative theory of everything is expected to achieve the deduction of all phenomena from 
basic axioms. 

Since ancient Greek times, philosophers have speculated that the apparent diversity of appearances conceals an 
underlying unity, and thus that the list of forces might be short, indeed might contain only a single entry. For 
example, the mechanical philosophy of the 17th century posited that all forces could be ultimately reduced to contact 
forces between tiny solid particles. This was abandoned after the acceptance of Isaac Newton's long-distance force 
of gravity; but at the same time, Newton's work in his Principia provided the first dramatic empirical evidence for 
the unification of apparently distinct forces: Galileo's work on terrestrial gravity, Kepler's laws of planetary motion, 
and the phenomenon of tides were all quantitatively explained by a single law of universal gravitation. 

Building on these results, Laplace famously suggested that a sufficiently powerful intellect could, if it knew the 
position and velocity of every particle at a given time, along with the laws of nature, calculate the position of any 






particle at any other time: 

An intellect which at a certain moment would know all forces that set nature in motion, and all positions of all 
items of which nature is composed, if this intellect were also vast enough to submit these data to analysis, it 
would embrace in a single formula the movements of the greatest bodies of the universe and those of the 
tiniest atom; for such an intellect nothing would be uncertain and the future just like the past would be present 
before its eyes. 

— Essai philosophique sur les probabilites, Introduction. 1814 

Modern quantum mechanics implies that uncertainty is inescapable, and thus that Laplace's vision cannot be correct. 
A theory of everything must therefore include quantum mechanics and gravitation. 

In 1820, Hans Christian Ørsted discovered a connection between electricity and magnetism, triggering decades of 
work that culminated in James Clerk Maxwell's theory of electromagnetism. Also during the 19th and early 20th 
centuries, it gradually became apparent that many common examples of forces — contact forces, elasticity, viscosity, 
friction, pressure — resulted from electrical interactions between the smallest particles of matter. 

In the late 1920s, the new quantum mechanics showed that the chemical bonds between atoms were examples of 
(quantum) electrical forces, justifying Dirac's boast that "the underlying physical laws necessary for the 
mathematical theory of a large part of physics and the whole of chemistry are thus completely known". 

Attempts to unify gravity with electromagnetism date back at least to Michael Faraday's experiments of 1849-50. 
After Albert Einstein's theory of gravity (general relativity) was published in 1915, the search for a unified field 
theory combining gravity with electromagnetism began in earnest. At the time, it seemed plausible that no other 
fundamental forces exist. Prominent contributors were Gunnar Nordström, Hermann Weyl, Arthur Eddington, 
Theodor Kaluza, Oskar Klein, and most notably, many attempts by Einstein and his collaborators. In his last years, 
Albert Einstein was intensely occupied in finding such a unifying theory. None of these attempts was successful. 

The nuclear interactions 

In the twentieth century, the search for a unifying theory was interrupted by the discovery of the strong and weak 
nuclear forces (or interactions), which differ both from gravity and from electromagnetism. A further hurdle was the 
acceptance that quantum mechanics had to be incorporated from the start, rather than emerging as a consequence of a 
deterministic unified theory, as Einstein had hoped. 

Gravity and electromagnetism could always peacefully coexist as entries in a list of classical forces, but for many 
years it seemed that gravity could not even be incorporated into the quantum framework, let alone unified with the 
other fundamental forces. For this reason, work on unification, for much of the twentieth century, focused on 
understanding the three "quantum" forces: electromagnetism and the weak and strong forces. The first two were 
combined in 1967-68 by Sheldon Glashow, Steven Weinberg, and Abdus Salam into the "electroweak" force. 
However, while the strong and electroweak forces peacefully coexist in the Standard Model of particle physics, they 
remain distinct. 

Electroweak unification is a broken symmetry: the electromagnetic and weak forces appear distinct at low energies 
because the particles carrying the weak force, the W and Z bosons, have masses of 80.4 GeV/$c^2$ and 91.2 GeV/$c^2$, 
whereas the photon, which carries the electromagnetic force, is massless. At higher energies Ws and Zs can be 
created easily and the unified nature of the force becomes apparent. 

Several Grand Unified Theories (GUTs) have been proposed to unify electromagnetism and the weak and strong 
forces. Grand unification is expected to set in at energies of the order of $10^{16}$ GeV, far greater than could be reached 
by any possible Earth-based particle accelerator. Although the simplest GUTs have been experimentally ruled out, 
the general idea, especially when linked with supersymmetry, remains a favorite candidate in the theoretical physics 
community. 






Modern physics 

The conventional pattern of theories 

A Theory of Everything would unify all the fundamental interactions of nature: gravitation, strong interaction, weak 
interaction, and electromagnetism. Because the weak interaction can transform elementary particles from one kind 
into another, the TOE should also yield a deep understanding of the various different kinds of possible particles. The 
usual assumed path of theories is given in the following graph, where each unification step leads one level higher: 

Theory of Everything 
    Gravitation 
    Electronuclear force (GUT) 
        Strong interaction, SU(3) 
        Electroweak force, SU(2) x U(1) 
            Weak interaction, SU(2) 
            Electromagnetism, U(1) 
                Electricity 
                Magnetism 

In this graph, electroweak unification occurs at around 100 GeV, grand unification is predicted to occur at $10^{16}$ GeV, 
and unification of the GUT force with gravity is expected at the Planck energy, roughly $10^{19}$ GeV. 

In addition to the forces listed in the graph, a TOE must also explain the status of at least two candidate forces 
suggested by modern cosmology: an inflationary force and dark energy. Furthermore, cosmological experiments also 
suggest the existence of dark matter, supposedly composed of fundamental particles outside the scheme of the 
standard model. However, the existence of these forces and particles has not been proven yet. 

It may seem premature to be searching for a TOE when there is as yet no direct evidence for an electronuclear force, 
and while in any case there are many different proposed GUTs for this force. Nevertheless, most physicists believe 
that a GUT is possible, mainly due to the past history of convergence towards a single theory. Supersymmetric 
GUTs seem plausible not only for their theoretical "beauty", but because they naturally produce large quantities of 
dark matter, and the inflationary force may be related to GUT physics (although it does not seem to form an 
inevitable part of the theory). Yet GUTs are clearly not the final answer. Both the current standard model and all 
proposed GUTs are quantum field theories which require the problematic technique of renormalization to yield 
sensible answers. This is usually regarded as a sign that these are only effective field theories, omitting crucial 
phenomena relevant only at very high energies. Furthermore, the inconsistency between quantum mechanics and 
general relativity implies that one or both of these must be replaced by a theory incorporating quantum gravity. 



String theory and M-theory 

The mainstream theory of everything at the moment is superstring theory / M-theory. These theories attempt to deal 
with the renormalization problem by setting up some lower bound on the length scales possible. 

String theories and supergravity (both believed to be limiting cases of the yet-to-be-defined M-theory) suppose that 
the universe actually has more dimensions than the easily observed three of space and one of time. The motivation 
behind this approach began with the Kaluza-Klein theory in which it was noted that applying general relativity to a 
five dimensional universe (with the usual four dimensions plus one small curled-up dimension) yields the equivalent 
of the usual general relativity in four dimensions together with Maxwell's equations (electromagnetism, also in four 
dimensions). This has led to efforts to work with theories with a large number of dimensions in the hopes that this 






would produce equations that are similar to known laws of physics. The notion of extra dimensions also helps to 
resolve the hierarchy problem, which is the question of why gravity is so much weaker than any other force. The 
common answer involves gravity leaking into the extra dimensions in ways that the other forces do not. 

In the late 1990s, it was noted that one problem with several of the candidates for theories of everything (but 
particularly string theory) was that they did not constrain the characteristics of the predicted universe. For example, 
many theories of quantum gravity can create universes with arbitrary numbers of dimensions or with arbitrary 
cosmological constants. Even the "standard" ten-dimensional string theory allows the "curled up" dimensions to be 
compactified in an enormous number of different ways (one estimate is $10^{500}$), each of which corresponds to a 
different collection of fundamental particles and low-energy forces. This array of theories is known as the string 
theory landscape. 

A speculative solution is that many or all of these possibilities are realised in one or another of a huge number of 
universes, but that only a small number of them are habitable, and hence the fundamental constants of the universe 
are ultimately the result of the anthropic principle rather than a consequence of the theory of everything. This 
anthropic approach is often criticised in that, because the theory is flexible enough to encompass almost any 
observation, it cannot make useful (as in original, falsifiable, and verifiable) predictions. In this view, string theory 
would be considered a pseudoscience, where an unfalsifiable theory is constantly adapted to fit the experimental 
results. 

Loop quantum gravity 

Current research on loop quantum gravity may eventually play a fundamental role in a TOE, but that is not its 
primary aim. Loop quantum gravity is facing difficulties in incorporating electromagnetism and the nuclear 
interactions. 

Other attempts 

Any TOE must include general relativity and the standard model of particle physics. Outside the previously 
mentioned attempts, the best-known one is Garrett Lisi's E8 proposal. 

Present status 

At present, no convincing candidate for a TOE is available. Most particle physicists tend to state that the outcome of
the ongoing experiments at the large particle accelerators, the LHC and the Tevatron, is needed in order to provide
theoreticians with precise input for such a theory.

Arguments against a theory of everything

Gödel's incompleteness theorem

A small number of scientists claim that Gödel's incompleteness theorem proves that any attempt to construct a TOE
is bound to fail. Gödel's theorem, informally stated, asserts that any formal theory expressive enough for elementary
arithmetical facts to be expressed and strong enough for them to be proved is either inconsistent (both a statement
and its denial can be derived from its axioms) or incomplete, in the sense that there is a true statement about natural
numbers that can't be derived in the formal theory. In his 1966 book The Relevance of Physics, Stanley Jaki pointed
out that, because any "theory of everything" will certainly be a consistent non-trivial mathematical theory, it must be
incomplete. He claims that this dooms searches for a deterministic theory of everything.[9] In a later reflection, Jaki
states that it is wrong to say that a final theory is impossible, but rather that "when it is on hand one cannot know
rigorously that it is a final theory."[10]

Freeman Dyson has stated that

Gödel's theorem implies that pure mathematics is inexhaustible. No matter how many problems we solve, there will always be other problems
that cannot be solved within the existing rules. [...] Because of Gödel's theorem, physics is inexhaustible too. The laws of physics are a finite
set of rules, and include the rules for doing mathematics, so that Gödel's theorem applies to them.

— NYRB, May 13,2004 

Stephen Hawking was originally a believer in the Theory of Everything but, after considering Gödel's Theorem,
concluded that one was not obtainable.

Some people will be very disappointed if there is not an ultimate theory, that can be formulated as a finite number of principles. I used to 
belong to that camp, but I have changed my mind. 

— Gödel and the end of physics,[11] July 20, 2002

Jürgen Schmidhuber (1997) has argued against this view; he points out that Gödel's theorems are irrelevant for
computable physics. In 2000, Schmidhuber explicitly constructed limit-computable, deterministic universes
whose pseudo-randomness based on undecidable, Gödel-like halting problems is extremely hard to detect but does
not at all prevent formal TOEs describable by very few bits of information.

A related critique was offered by Solomon Feferman,[15] among others. Douglas S. Robertson offers Conway's game
of life as an example:[16] The underlying rules are simple and complete, but there are formally undecidable questions
about the game's behaviors. Analogously, it may (or may not) be possible to completely state the underlying rules of 
physics with a finite number of well-defined laws, but there is little doubt that there are questions about the behavior 
of physical systems which are formally undecidable on the basis of those underlying laws. 

Since most physicists would consider the statement of the underlying rules to suffice as the definition of a "theory of
everything", these researchers argue that Gödel's Theorem does not mean that a TOE cannot exist. On the other
hand, the physicists invoking Gödel's Theorem appear, at least in some cases, to be referring not to the underlying
rules, but to the understandability of the behavior of all physical systems, as when Hawking mentions arranging
blocks into rectangles, turning the computation of prime numbers into a physical question.[17] This definitional
discrepancy may explain some of the disagreement among researchers.

Another approach to working with the limits of logic implied by Gödel's incompleteness theorems is to abandon the
attempt to model reality using a formal system altogether. Process Physics[18] is a notable example of a candidate
TOE that takes this approach, where reality is modeled using self-organizing (purely semantic) information.

Fundamental limits in accuracy 

No physical theory to date is believed to be precisely accurate. Instead, physics has proceeded by a series of 
"successive approximations" allowing more and more accurate predictions over a wider and wider range of 
phenomena. Some physicists believe that it is therefore a mistake to confuse theoretical models with the true nature 
of reality, and hold that the series of approximations will never terminate in the "truth". Einstein himself expressed
this view on occasions.[19] On this view, we may reasonably hope for a theory of everything which self-consistently
incorporates all currently known forces, but should not expect it to be the final answer. 

On the other hand it is often claimed that, despite the apparently ever-increasing complexity of the mathematics of 
each new theory, in a deep sense associated with their underlying gauge symmetry and the number of fundamental 
physical constants, the theories are becoming simpler. If so, the process of simplification cannot continue 
indefinitely. 






Lack of fundamental laws 

There is a philosophical debate within the physics community as to whether a theory of everything deserves to be 
called the fundamental law of the universe.[20] One view is the hard reductionist position that the TOE is the
fundamental law and that all other theories that apply within the universe are a consequence of the TOE. Another 
view is that emergent laws (called "free floating laws" by Steven Weinberg), which govern the behavior of complex 
systems, should be seen as equally fundamental. Examples are the second law of thermodynamics and the theory of 
natural selection. The point is that, although in our universe these laws describe systems whose behaviour could
("in principle") be predicted from a TOE, they would also hold in universes with different low-level laws, subject 
only to some very general conditions. Therefore it is of no help, even in principle, to invoke low-level laws when 
discussing the behavior of complex systems. Some argue that this attitude would violate Occam's Razor if a 
completely valid TOE were formulated. It is not clear that there is any point at issue in these debates (e.g., between 
Steven Weinberg and Philip Anderson) other than the right to apply the high-status word "fundamental" to their 
respective subjects of interest. 

Impossibility of being "of everything" 

Although the name "theory of everything" suggests the determinism of Laplace's quotation, this gives a very 
misleading impression. Determinism is frustrated by the probabilistic nature of quantum mechanical predictions, by 
the extreme sensitivity to initial conditions that leads to mathematical chaos, and by the extreme mathematical 
difficulty of applying the theory. Thus, although the current standard model of particle physics "in principle" predicts 
all known non-gravitational phenomena, in practice only a few quantitative results have been derived from the full 
theory (e.g., the masses of some of the simplest hadrons), and these results (especially the particle masses which are 
most relevant for low-energy physics) are less accurate than existing experimental measurements. The true TOE 
would almost certainly be even harder to apply. The main motive for seeking a TOE, apart from the pure intellectual 
satisfaction of completing a centuries-long quest, is that all prior successful unifications have predicted new 
phenomena, some of which (e.g., electrical generators) have proved of great practical importance. As in other cases 
of theory reduction, the TOE would also allow us to confidently define the domain of validity and residual error of 
low-energy approximations to the full theory which could be used for practical calculations. 

Theory of everything and philosophy 

The philosophical implications of a physical TOE are frequently debated. For example, if physicalism is true, a
physical TOE will coincide with a philosophical theory of everything. Some philosophers (Aristotle, Plato, Hegel, 
Whitehead, et al.) have attempted to construct all-encompassing systems. Others are highly dubious about the very 
possibility of such an exercise. 

Stephen Hawking wrote in A Brief History of Time that even if we had a TOE, it would necessarily be a set of 
equations. He wrote, "What is it that breathes fire into the equations and makes a universe for them to describe?"[21]

While on his deathbed, Einstein still explored equations that he imagined to be candidates of a unified theory. Of 
course, the question would then be "why those equations?" One possible solution might be to adopt the point of view 
of ultimate ensemble, or modal realism, and say that those equations are not unique. Others doubt that the theory of 
everything will be in the form of equations at all. 






References 

[1] Ellis, John (2002). "Physics gets physical (correspondence)". Nature 415: 957.
[2] Ellis, John (1986). "The Superstring: Theory of Everything, or of Nothing?". Nature 323: 595-598. doi:10.1038/323595a0.
[3] Shapin, Steven (1996). The Scientific Revolution. University of Chicago Press. ISBN 0226750213.
[4] Dirac, P.A.M. (1929). "Quantum mechanics of many-electron systems". Proceedings of the Royal Society of London A 123: 714.
doi:10.1098/rspa.1929.0094.
[5] Faraday, M. (1850). "Experimental Researches in Electricity. Twenty-Fourth Series. On the Possible Relation of Gravity to Electricity".
Abstracts of the Papers Communicated to the Royal Society of London 5: 994-995. doi:10.1098/rspl.1843.0267.
[6] Pais (1982), Ch. 17.
[7] Weinberg (1993), Ch. 5.
[8] Potter, Franklin (15 February 2005). "Leptons And Quarks In A Discrete Spacetime" (http://www.sciencegems.com/discretespace.pdf).
Frank Potter's Science Gems. Retrieved 2009-12-01.
[9] Jaki, S.L. (1966). The Relevance of Physics. Chicago Press.
[10] Stanley L. Jaki (2004) "A Late Awakening to Gödel in Physics" (http://www.sljaki.com/JakiGodel.pdf), p. 8-9.
[11] http://www.damtp.cam.ac.uk/strings02/dirac/hawking/
[12] Schmidhuber, Jürgen (1997). A Computer Scientist's View of Life, the Universe, and Everything. Lecture Notes in Computer Science
(http://www.idsia.ch/~juergen/everything/). Springer, pp. 201-208. doi:10.1007/BFb0052071. ISBN 978-3-540-63746-2.
[13] Schmidhuber, Jürgen (2000). "Algorithmic Theories of Everything". arXiv:quant-ph/0011122 [quant-ph].
[14] Schmidhuber, Jürgen (2002). "Hierarchies of generalized Kolmogorov complexities and nonenumerable universal measures computable in
the limit". International Journal of Foundations of Computer Science 13 (4): 587-612. doi:10.1142/S0129054102001291.
[15] Feferman, Solomon (17 November 2006). "The nature and significance of Gödel's incompleteness theorems"
(http://math.stanford.edu/~feferman/papers/Godel-IAS.pdf). Institute for Advanced Study. Retrieved 2009-01-12.
[16] Robertson, Douglas S. (2007). "Goedel's Theorem, the Theory of Everything, and the Future of Science and Mathematics". Complexity 5:
22-27. doi:10.1002/1099-0526(200005/06)5:5<22::AID-CPLX4>3.0.CO;2-0.
[17] Hawking, Stephen (20 July 2002). "Gödel and the end of physics" (http://www.damtp.cam.ac.uk/strings02/dirac/hawking/).
Retrieved 2009-12-01.
[18] Cahill, Reginald (2003). "Process Physics"
(http://www.ctr4process.org/publications/ProcessStudies/PSS/2003-5-CahillR-Process_Physics.shtml). Process Studies Supplement. Center for
Process Studies, pp. 1-131. Retrieved 2009-07-14.
[19] Einstein, letter to Felix Klein, 1917. (On determinism and approximations.) Quoted in Pais (1982), Ch. 17.
[20] Weinberg (1993), Ch. 2.
[21] As quoted in Artigas, The Mind of the Universe, p. 123.

• John D. Barrow, Theories of Everything: The Quest for Ultimate Explanation (OUP, Oxford, 1990) ISBN 
0-099-98380-X 

• Stephen Hawking 'The Theory of Everything: The Origin and Fate of the Universe' is an unauthorized 2002 book 
taken from recorded lectures (ISBN 1-893224-79-1) 

• Stanley Jaki OSB, 2005. The Drama of Quantities. Real View Books (ISBN 1-892548-47-X) 

• Abraham Pais Subtle is the Lord...: The Science and the Life of Albert Einstein (OUP, Oxford, 1982). ISBN 
0-19-853907-X 

• John Thompson "Nature's Watchmaker: The Undiscovered Miracle of Time". (Blackhall Publishing Ltd. Ireland, 
2009) ISBN 1842181742 (http://natureswatchmaker.com) 

• Steven Weinberg Dreams of a Final Theory: The Search for the Fundamental Laws of Nature (Hutchinson 
Radius, London, 1993) ISBN 0-09-1773954 

External links 

• The Elegant Universe-Nova online (http://www.pbs.org/wgbh/nova/elegant/program.html) — a 3 hour PBS 
show about the search for the Theory of everything and string theory. 

• 'Theory of Everything' (http://www.vega.org.uk/video/programme/7) Freeview video by the Vega Science
Trust and the BBC/OU. 






Quantum Algebra 



Quantum algebra 

Quantum algebra is one of the top-level mathematics categories used by the arXiv. 
Subjects include: 

• Quantum groups 

• Skein theories 

• Operadic algebra 

• Diagrammatic algebra 

• Quantum field theory 

External links 

• Quantum algebra at arxiv.org [1]

References

[1] http://arxiv.org/list/math.QA/current

Lie algebra 

In mathematics, a Lie algebra (pronounced /ˈliː/ ("lee"), not /ˈlaɪ/ ("lye")) is an algebraic structure whose main use is
in studying geometric objects such as Lie groups and differentiable manifolds. Lie algebras were introduced to study 
the concept of infinitesimal transformations. The term "Lie algebra" (after Sophus Lie) was introduced by Hermann 
Weyl in the 1930s. In older texts, the name "infinitesimal group" is used. 

Definition and first properties 

A Lie algebra is a vector space $\mathfrak{g}$ over some field F together with a binary operation

$[\cdot, \cdot] : \mathfrak{g} \times \mathfrak{g} \to \mathfrak{g}$

called the Lie bracket, which satisfies the following axioms:

• Bilinearity:

$[ax + by, z] = a[x, z] + b[y, z], \quad [z, ax + by] = a[z, x] + b[z, y]$

for all scalars a, b in F and all elements x, y, z in $\mathfrak{g}$.

• Alternating on $\mathfrak{g}$:

$[x, x] = 0$

for all x in $\mathfrak{g}$. This implies anticommutativity, or skew-symmetry (in fact the conditions are equivalent for any
Lie algebra over any field whose characteristic is not 2):

$[x, y] = -[y, x]$

for all elements x, y in $\mathfrak{g}$.

• The Jacobi identity:

$[x, [y, z]] + [y, [z, x]] + [z, [x, y]] = 0$

for all x, y, z in $\mathfrak{g}$.

For any associative algebra A with multiplication * , one can construct a Lie algebra L(A). As a vector space, L(A) is 
the same as A. The Lie bracket of two elements of L(A) is defined to be their commutator in A: 

[a, b] = a * b — b * a. 

The associativity of the multiplication * in A implies the Jacobi identity of the commutator in L(A). In particular, the
associative algebra of n × n matrices over a field F gives rise to the general linear Lie algebra $\mathfrak{gl}_n(F)$. The
associative algebra A is called an enveloping algebra of the Lie algebra L(A). It is known that every Lie algebra can 
be embedded into one that arises from an associative algebra in this fashion. See universal enveloping algebra. 
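
The commutator construction can be checked numerically on small matrices. The following is a minimal sketch in plain Python, not a general-purpose library: the helper names (matmul, add, sub, bracket, jacobi_sum) and the sample matrices are illustrative assumptions. It confirms on one example that the bracket [a, b] = a * b - b * a is alternating, anticommutative, and satisfies the Jacobi identity.

```python
def matmul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(a, b):
    """Entrywise sum of two matrices."""
    return [[p + q for p, q in zip(ra, rb)] for ra, rb in zip(a, b)]

def sub(a, b):
    """Entrywise difference of two matrices."""
    return [[p - q for p, q in zip(ra, rb)] for ra, rb in zip(a, b)]

def bracket(a, b):
    """Lie bracket on L(A): the commutator a*b - b*a."""
    return sub(matmul(a, b), matmul(b, a))

def jacobi_sum(x, y, z):
    """Left-hand side of the Jacobi identity; should be the zero matrix."""
    return add(add(bracket(x, bracket(y, z)),
                   bracket(y, bracket(z, x))),
               bracket(z, bracket(x, y)))

# Arbitrary sample elements of the algebra of 2 x 2 integer matrices.
x = [[1, 2], [3, 4]]
y = [[0, 1], [5, -2]]
z = [[7, 0], [1, 3]]
zero = [[0, 0], [0, 0]]

print(bracket(x, x) == zero)                  # alternating
print(bracket(x, y) == sub(zero, bracket(y, x)))  # anticommutative
print(jacobi_sum(x, y, z) == zero)            # Jacobi identity
```

Integer entries are used so that the comparisons are exact; with floating-point entries a tolerance would be needed.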

Homomorphisms, subalgebras, and ideals 

The Lie bracket is not an associative operation in general, meaning that [[x, y], z] need not equal [x, [y, z]].
Nonetheless, much of the terminology that was developed in the theory of associative rings or associative algebras is
commonly applied to Lie algebras. A subspace $\mathfrak{h} \subseteq \mathfrak{g}$ that is closed under the Lie bracket is called a Lie
subalgebra. If a subspace $I \subseteq \mathfrak{g}$ satisfies the stronger condition that

$[\mathfrak{g}, I] \subseteq I,$

then I is called an ideal in the Lie algebra $\mathfrak{g}$.[1] A Lie algebra in which the commutator is not identically zero and
which has no proper nonzero ideals is called simple. A homomorphism between two Lie algebras (over the same ground
field) is a linear map that is compatible with the commutators:

$f : \mathfrak{g} \to \mathfrak{g}', \quad f([x, y]) = [f(x), f(y)],$

for all elements x and y in $\mathfrak{g}$. As in the theory of associative rings, ideals are precisely the kernels of
homomorphisms; given a Lie algebra $\mathfrak{g}$ and an ideal I in it, one constructs the factor algebra $\mathfrak{g}/I$, and the first
isomorphism theorem holds for Lie algebras. Given two Lie algebras $\mathfrak{g}$ and $\mathfrak{g}'$, their direct sum is the vector space
$\mathfrak{g} \oplus \mathfrak{g}'$ consisting of the pairs $(x, x')$, $x \in \mathfrak{g}$, $x' \in \mathfrak{g}'$, with the operation

$[(x, x'), (y, y')] = ([x, y], [x', y']), \quad x, y \in \mathfrak{g}, \; x', y' \in \mathfrak{g}'.$

Examples

• Any vector space V endowed with the identically zero Lie bracket becomes a Lie algebra. Such Lie algebras are 
called abelian, cf. below. Any one-dimensional Lie algebra over a field is abelian, by the antisymmetry of the Lie 
bracket. 

• The three-dimensional Euclidean space $\mathbb{R}^3$ with the Lie bracket given by the cross product of vectors becomes a
three-dimensional Lie algebra.

• The Heisenberg algebra is a three-dimensional Lie algebra with generators (see also the definition at Generating
set):

$x = \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \quad y = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{pmatrix}, \quad z = \begin{pmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix},$

whose commutation relations are

$[x, y] = z, \quad [x, z] = 0, \quad [y, z] = 0.$

It is explicitly exhibited as the space of 3 × 3 strictly upper-triangular matrices.

• The subspace of the general linear Lie algebra $\mathfrak{gl}_n(F)$ consisting of matrices of trace zero is a subalgebra,[2] the
special linear Lie algebra, denoted $\mathfrak{sl}_n(F)$.






• Any Lie group G defines an associated real Lie algebra $\mathfrak{g} = \mathrm{Lie}(G)$. The definition in general is somewhat
technical, but in the case of real matrix groups, it can be formulated via the exponential map, or the matrix
exponential. The Lie algebra $\mathfrak{g}$ consists of those matrices X for which

$\exp(tX) \in G$

for all real numbers t. The Lie bracket of $\mathfrak{g}$ is given by the commutator of matrices. As a concrete example,
consider the special linear group SL(n,R), consisting of all n × n matrices with real entries and determinant 1.
This is a matrix Lie group, and its Lie algebra consists of all n × n matrices with real entries and trace 0.

• The real vector space of all n × n skew-Hermitian matrices is closed under the commutator and forms a real Lie
algebra denoted $\mathfrak{u}(n)$. This is the Lie algebra of the unitary group U(n).

• An important class of infinite-dimensional real Lie algebras arises in differential topology. The space of smooth 
vector fields on a differentiable manifold M forms a Lie algebra, where the Lie bracket is defined to be the 
commutator of vector fields. One way of expressing the Lie bracket is through the formalism of Lie derivatives, 
which identifies a vector field X with a first order partial differential operator $L_X$ acting on smooth functions by
letting $L_X(f)$ be the directional derivative of the function f in the direction of X. The Lie bracket [X,Y] of two
vector fields is the vector field defined through its action on functions by the formula:

$L_{[X,Y]} f = L_X(L_Y f) - L_Y(L_X f).$

This Lie algebra is related to the pseudogroup of diffeomorphisms of M. 

• The commutation relations between the x, y, and z components of the angular momentum operator in quantum 
mechanics form a representation of a complex three-dimensional Lie algebra, which is the complexification of the 
Lie algebra $\mathfrak{so}(3)$ of the three-dimensional rotation group:

$[L_x, L_y] = i\hbar L_z$
$[L_y, L_z] = i\hbar L_x$
$[L_z, L_x] = i\hbar L_y$

• Kac-Moody algebra is an example of an infinite-dimensional Lie algebra. 
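
Two of the finite-dimensional examples above can be verified by direct computation. The sketch below is illustrative only: the helper names (matmul, commutator, cross, vsum) are assumptions, and the Heisenberg relations checked are the standard ones for the three strictly upper-triangular generators, namely [x, y] = z with z commuting with both x and y. The second half checks the Jacobi identity for the cross product on one triple of integer vectors.

```python
def matmul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def commutator(a, b):
    """Lie bracket [a, b] = ab - ba of two square matrices."""
    ab, ba = matmul(a, b), matmul(b, a)
    return [[p - q for p, q in zip(r1, r2)] for r1, r2 in zip(ab, ba)]

def cross(u, v):
    """Cross product on R^3, used as a Lie bracket."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def vsum(u, v, w):
    """Componentwise sum of three vectors."""
    return tuple(a + b + c for a, b, c in zip(u, v, w))

# Generators of the Heisenberg algebra as strictly upper-triangular matrices.
x = [[0, 1, 0], [0, 0, 0], [0, 0, 0]]
y = [[0, 0, 0], [0, 0, 1], [0, 0, 0]]
z = [[0, 0, 1], [0, 0, 0], [0, 0, 0]]
zero = [[0, 0, 0], [0, 0, 0], [0, 0, 0]]

heisenberg_ok = (commutator(x, y) == z
                 and commutator(x, z) == zero
                 and commutator(y, z) == zero)

# Jacobi identity for the cross product on arbitrary sample vectors.
a, b, c = (1, 2, 3), (-4, 0, 5), (2, -1, 7)
jacobi = vsum(cross(a, cross(b, c)),
              cross(b, cross(c, a)),
              cross(c, cross(a, b)))

print(heisenberg_ok)
print(jacobi == (0, 0, 0))
```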

Structure theory and classification 

Every finite-dimensional real or complex Lie algebra has a faithful representation by matrices (Ado's theorem). Lie's 
fundamental theorems describe a relation between Lie groups and Lie algebras. In particular, any Lie group gives 
rise to a canonically determined Lie algebra (concretely, the tangent space at the identity), and conversely, for any 
Lie algebra there is a corresponding connected Lie group (Lie's third theorem). This Lie group is not determined
uniquely; however, any two connected Lie groups with the same Lie algebra are locally isomorphic, and in
particular, have the same universal cover. For instance, the special orthogonal group SO(3) and the special unitary
group SU(2) give rise to the same Lie algebra, which is isomorphic to $\mathbb{R}^3$ with the cross product, and SU(2) is a
simply-connected twofold cover of SO(3). Real and complex Lie algebras can be classified to some extent, and this 
is often an important step toward the classification of Lie groups. 
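
The isomorphism between the Lie algebra of SO(3) and three-dimensional space with the cross product can be made concrete. A minimal sketch, assuming the usual identification of a vector a with the skew-symmetric matrix hat(a) satisfying hat(a) v = a × v (the names hat, matmul, commutator and cross are illustrative): under this map the cross product of vectors goes to the commutator of matrices.

```python
def matmul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def commutator(a, b):
    """Lie bracket [a, b] = ab - ba of two square matrices."""
    ab, ba = matmul(a, b), matmul(b, a)
    return [[p - q for p, q in zip(r1, r2)] for r1, r2 in zip(ab, ba)]

def cross(u, v):
    """Cross product on R^3."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def hat(a):
    """Skew-symmetric 3 x 3 matrix with hat(a) v = a x v for every v."""
    return [[0, -a[2], a[1]],
            [a[2], 0, -a[0]],
            [-a[1], a[0], 0]]

# Arbitrary integer sample vectors, so that all comparisons are exact.
a, b = (1, -2, 3), (4, 0, -5)

# The linear map hat sends the cross product to the matrix commutator.
bracket_preserved = hat(cross(a, b)) == commutator(hat(a), hat(b))
print(bracket_preserved)
```

Since hat is linear and injective and its image is the space of all real skew-symmetric 3 × 3 matrices, a check of this kind on basis vectors is enough to see that it is a Lie algebra isomorphism.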

Abelian, nilpotent, and solvable 

Analogously to abelian, nilpotent, and solvable groups, defined in terms of the derived subgroups, one can define 
abelian, nilpotent, and solvable Lie algebras. 

A Lie algebra $\mathfrak{g}$ is abelian if the Lie bracket vanishes, i.e. [x,y] = 0, for all x and y in $\mathfrak{g}$. Abelian Lie algebras
correspond to commutative (or abelian) connected Lie groups such as vector spaces $\mathbb{K}^n$ or tori $\mathbb{T}^n$, and are all of
the form $\mathfrak{k}^n$, meaning an n-dimensional vector space with the trivial Lie bracket.

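A more general class of Lie algebras is defined by the vanishing of all commutators of given length. A Lie algebra $\mathfrak{g}$
is nilpotent if the lower central series

$\mathfrak{g} > [\mathfrak{g}, \mathfrak{g}] > [[\mathfrak{g}, \mathfrak{g}], \mathfrak{g}] > [[[\mathfrak{g}, \mathfrak{g}], \mathfrak{g}], \mathfrak{g}] > \cdots$

becomes zero eventually. By Engel's theorem, a Lie algebra is nilpotent if and only if for every u in $\mathfrak{g}$ the adjoint
endomorphism

$\mathrm{ad}(u) : \mathfrak{g} \to \mathfrak{g}, \quad \mathrm{ad}(u)v = [u, v]$

is nilpotent.

More generally still, a Lie algebra $\mathfrak{g}$ is said to be solvable if the derived series:

$\mathfrak{g} > [\mathfrak{g}, \mathfrak{g}] > [[\mathfrak{g}, \mathfrak{g}], [\mathfrak{g}, \mathfrak{g}]] > [[[\mathfrak{g}, \mathfrak{g}], [\mathfrak{g}, \mathfrak{g}]], [[\mathfrak{g}, \mathfrak{g}], [\mathfrak{g}, \mathfrak{g}]]] > \cdots$

becomes zero eventually.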

Every finite-dimensional Lie algebra has a unique maximal solvable ideal, called its radical. Under the Lie 
correspondence, nilpotent (respectively, solvable) connected Lie groups correspond to nilpotent (respectively, 
solvable) Lie algebras. 
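
As an illustration of nilpotency, the Heisenberg algebra can be treated in coordinates. The sketch below assumes a basis (x, y, z) in which the only nonzero bracket of basis elements is [x, y] = z (and hence [y, x] = -z); the function names are hypothetical. It checks that the second step of the lower central series already vanishes and that ad(u) squares to zero for a sample element u, in line with Engel's theorem.

```python
def bracket(u, v):
    """Heisenberg bracket in coordinates w.r.t. the basis (x, y, z).

    Only [x, y] = z is nonzero, so the result lies along z.
    """
    return (0, 0, u[0] * v[1] - u[1] * v[0])

BASIS = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]

def ad_matrix(u):
    """Matrix of ad(u): column j holds the coordinates of [u, basis_j]."""
    cols = [bracket(u, e) for e in BASIS]
    return [[cols[j][i] for j in range(3)] for i in range(3)]

def matmul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

zero = [[0, 0, 0], [0, 0, 0], [0, 0, 0]]

# Lower central series: every bracket [e, [f, h]] of basis elements vanishes.
second_step_vanishes = all(
    bracket(e, bracket(f, h)) == (0, 0, 0)
    for e in BASIS for f in BASIS for h in BASIS
)

# Engel's criterion on a sample element u = 2x - 3y + 5z.
u = (2, -3, 5)
A = ad_matrix(u)
ad_is_nilpotent = A != zero and matmul(A, A) == zero

print(second_step_vanishes)
print(ad_is_nilpotent)
```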

Simple and semisimple 

A Lie algebra is "simple" if it has no non-trivial ideals and is not abelian. A Lie algebra $\mathfrak{g}$ is called semisimple if its
radical is zero. Equivalently, $\mathfrak{g}$ is semisimple if it does not contain any non-zero abelian ideals. In particular, a
simple Lie algebra is semisimple. Conversely, it can be proven that any semisimple Lie algebra is the direct sum of
its minimal ideals, which are canonically determined simple Lie algebras.

The concept of semisimplicity for Lie algebras is closely related with the complete reducibility of their
representations. When the ground field F has characteristic zero, semisimplicity of a Lie algebra $\mathfrak{g}$ over F is
equivalent to the complete reducibility of all finite-dimensional representations of $\mathfrak{g}$. An early proof of this
statement proceeded via connection with compact groups (Weyl's unitary trick), but later entirely algebraic proofs 
were found. 

Classification 

In many ways, the classes of semisimple and solvable Lie algebras are at the opposite ends of the full spectrum of 
the Lie algebras. The Levi decomposition expresses an arbitrary Lie algebra as a semidirect sum of its solvable 
radical and a semisimple Lie algebra, almost in a canonical way. Semisimple Lie algebras over an algebraically 
closed field have been completely classified through their root systems. The classification of solvable Lie algebras is 
a 'wild' problem, and cannot be accomplished in general. 

Cartan's criterion gives conditions for a Lie algebra to be nilpotent, solvable, or semisimple. It is based on the notion 
of the Killing form, a symmetric bilinear form on $\mathfrak{g}$ defined by the formula

$K(u, v) = \mathrm{tr}(\mathrm{ad}(u)\,\mathrm{ad}(v)),$

where tr denotes the trace of a linear operator. A Lie algebra $\mathfrak{g}$ is semisimple if and only if the Killing form is
nondegenerate. A Lie algebra $\mathfrak{g}$ is solvable if and only if $K(\mathfrak{g}, [\mathfrak{g}, \mathfrak{g}]) = 0$.
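
The semisimplicity criterion can be tried out on the special linear Lie algebra of 2 × 2 trace-zero matrices. The following sketch is an illustration under stated assumptions: the basis E, F, H is the conventional one, and the helper names (coords, ad, killing, det3) are hypothetical. It builds the matrix of the Killing form in that basis and checks that its determinant is nonzero, i.e. that the form is nondegenerate.

```python
def matmul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def commutator(a, b):
    """Lie bracket [a, b] = ab - ba of two square matrices."""
    ab, ba = matmul(a, b), matmul(b, a)
    return [[p - q for p, q in zip(r1, r2)] for r1, r2 in zip(ab, ba)]

# Conventional basis of the trace-zero 2 x 2 matrices.
E = [[0, 1], [0, 0]]
F = [[0, 0], [1, 0]]
H = [[1, 0], [0, -1]]
BASIS = [E, F, H]

def coords(m):
    """Coordinates of a trace-zero matrix [[a, b], [c, -a]] = b*E + c*F + a*H."""
    return [m[0][1], m[1][0], m[0][0]]

def ad(u):
    """3 x 3 matrix of ad(u) in the basis (E, F, H); columns are images."""
    cols = [coords(commutator(u, b)) for b in BASIS]
    return [[cols[j][i] for j in range(3)] for i in range(3)]

def killing(u, v):
    """Killing form K(u, v) = tr(ad(u) ad(v))."""
    p = matmul(ad(u), ad(v))
    return sum(p[i][i] for i in range(3))

def det3(m):
    """Determinant of a 3 x 3 matrix by cofactor expansion."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

gram = [[killing(u, v) for v in BASIS] for u in BASIS]
print(gram)             # [[0, 4, 0], [4, 0, 0], [0, 0, 8]]
print(det3(gram) != 0)  # True: the Killing form is nondegenerate
```

Running the same computation on a nilpotent algebra such as the Heisenberg algebra would give the zero matrix, since every product ad(u) ad(v) is then nilpotent and has trace zero.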

Relation to Lie groups 

Although Lie algebras are often studied in their own right, historically they arose as a means to study Lie groups. 
Given a Lie group, a Lie algebra can be associated to it either by endowing the tangent space to the identity with the 
differential of the adjoint map, or by considering the left-invariant vector fields as mentioned in the examples. This 
association is functorial, meaning that homomorphisms of Lie groups lift to homomorphisms of Lie algebras, and 
various properties are satisfied by this lifting: it commutes with composition, it maps Lie subgroups, kernels, 
quotients and cokernels of Lie groups to subalgebras, kernels, quotients and cokernels of Lie algebras, respectively. 

The functor which takes each Lie group to its Lie algebra and each homomorphism to its differential is a faithful and 
exact functor. This functor is not invertible; different Lie groups may have the same Lie algebra, for example SO(3)
and SU(2) have isomorphic Lie algebras. Even worse, some Lie algebras need not have any associated Lie group.
Nevertheless, when the Lie algebra is finite-dimensional, there is always at least one Lie group whose Lie algebra is 
the one under discussion, and a preferred Lie group can be chosen. Any finite-dimensional connected Lie group has 
a universal cover. This group can be constructed as the image of the Lie algebra under the exponential map. More 
generally, we have that the Lie algebra is homeomorphic to a neighborhood of the identity. But globally, if the Lie 
group is compact, the exponential will not be injective, and if the Lie group is not connected, simply connected or 
compact, the exponential map need not be surjective. 

If the Lie algebra is infinite-dimensional, the issue is more subtle. In many instances, the exponential map is not even 
locally a homeomorphism (for example, in Diff($S^1$), one may find diffeomorphisms arbitrarily close to the identity
which are not in the image of exp). Furthermore, some infinite-dimensional Lie algebras are not the Lie algebra of 
any group. 

The correspondence between Lie algebras and Lie groups is used in several ways, including in the classification of 
Lie groups and the related matter of the representation theory of Lie groups. Every representation of a Lie algebra 
lifts uniquely to a representation of the corresponding connected, simply connected Lie group, and conversely every 
representation of any Lie group induces a representation of the group's Lie algebra; the representations are in one to 
one correspondence. Therefore, knowing the representations of a Lie algebra settles the question of representations 
of the group. As for classification, it can be shown that any connected Lie group with a given Lie algebra is 
isomorphic to the universal cover mod a discrete central subgroup. So classifying Lie groups becomes simply a 
matter of counting the discrete subgroups of the center, once the classification of Lie algebras is known (solved by 
Cartan et al. in the semisimple case). 



Category theoretic definition 

Using the language of category theory, a Lie algebra can be defined as an object A in Vec, the category of vector 
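spaces together with a morphism $[\cdot, \cdot] : A \otimes A \to A$, where $\otimes$ refers to the monoidal product of Vec, such that

• $[\cdot, \cdot] \circ (\mathrm{id} + \tau_{A,A}) = 0$

• $[\cdot, \cdot] \circ ([\cdot, \cdot] \otimes \mathrm{id}) \circ (\mathrm{id} + \sigma + \sigma^2) = 0$

where $\tau(a \otimes b) := b \otimes a$ and $\sigma$ is the cyclic permutation braiding $(\mathrm{id} \otimes \tau_{A,A}) \circ (\tau_{A,A} \otimes \mathrm{id})$. In diagrammatic form:

[Figure: diagrams of the two identities above, drawn on copies of A.]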

Notes 

[1] Due to the anticommutativity of the commutator, the notions of a left and right ideal in a Lie algebra coincide. 
[2] Humphreys p. 2 

References 

• Hall, Brian C. Lie Groups, Lie Algebras, and Representations: An Elementary Introduction, Springer, 2003. ISBN 
0-387-40122-9 

• Erdmann, Karin & Wildon, Mark. Introduction to Lie Algebras, 1st edition, Springer, 2006. ISBN 1-84628-040-0 

• Humphreys, James E. Introduction to Lie Algebras and Representation Theory, Second printing, revised. 
Graduate Texts in Mathematics, 9. Springer-Verlag, New York, 1978. ISBN 0-387-90053-5

• Jacobson, Nathan, Lie algebras, Republication of the 1962 original. Dover Publications, Inc., New York, 1979. 
ISBN 0-486-63832-4 

• Kac, Victor G. et al. Course notes for MIT 18.745: Introduction to Lie Algebras,
http://www-math.mit.edu/~lesha/745lec/

• O'Connor, J. J. & Robertson, E.F. Biography of Sophus Lie, MacTutor History of Mathematics Archive,
http://www-history.mcs.st-and.ac.uk/Biographies/Lie.html

• O'Connor, J. J. & Robertson, E.F. Biography of Wilhelm Killing, MacTutor History of Mathematics Archive,
http://www-history.mcs.st-and.ac.uk/Biographies/Killing.html

• Steeb, W.-H. Continuous Symmetries, Lie Algebras, Differential Equations and Computer Algebra, second 
edition, World Scientific, 2007, ISBN 978-981-270-809-0 

• Varadarajan, V. S. Lie Groups, Lie Algebras, and Their Representations, 1st edition, Springer, 2004. ISBN 
0-387-90969-9 

Lie group 

In mathematics, a Lie group (pronounced /ˈliː/: similar to "Lee") is a group which is also a differentiable manifold,
with the property that the group operations are compatible with the smooth structure. Lie groups are named after 
Sophus Lie, who laid the foundations of the theory of continuous transformation groups. 

Lie groups represent the best-developed theory of continuous symmetry of mathematical objects and structures, 
which makes them indispensable tools for many parts of contemporary mathematics, as well as for modern 
theoretical physics. They provide a natural framework for analysing the continuous symmetries of differential 
equations (Differential Galois theory), in much the same way as permutation groups are used in Galois theory for 
analysing the discrete symmetries of algebraic equations. An extension of Galois theory to the case of continuous 
symmetry groups was one of Lie's principal motivations. 



Overview

[Figure: The circle of center 0 and radius 1 in the complex plane is a Lie group with complex multiplication.]


Lie groups are smooth manifolds and, therefore, can be studied using 
differential calculus, in contrast with the case of more general 
topological groups. One of the key ideas in the theory of Lie groups, 
from Sophus Lie, is to replace the global object, the group, with its 
local or linearized version, which Lie himself called its "infinitesimal 
group" and which has since become known as its Lie algebra. 

Lie groups play an enormous role in modern geometry, on several 
different levels. Felix Klein argued in his Erlangen program that one 
can consider various "geometries" by specifying an appropriate 
transformation group that leaves certain geometric properties invariant. 
Thus Euclidean geometry corresponds to the choice of the group E(3) 
of distance-preserving transformations of the Euclidean space $\mathbb{R}^3$,
conformal geometry corresponds to enlarging the group to the
conformal group, whereas in projective geometry one is interested in
the properties invariant under the projective group. This idea later led to the notion of a G-structure, where G is a Lie
group of "local" symmetries of a manifold. On a "global" level, whenever a Lie group acts on a geometric object, 
such as a Riemannian or a symplectic manifold, this action provides a measure of rigidity and yields a rich algebraic 
structure. The presence of continuous symmetries expressed via a Lie group action on a manifold places strong 
constraints on its geometry and facilitates analysis on the manifold. Linear actions of Lie groups are especially 
important, and are studied in representation theory. 

In the 1940s-1950s, Ellis Kolchin, Armand Borel and Claude Chevalley realised that many foundational results 
concerning Lie groups can be developed completely algebraically, giving rise to the theory of algebraic groups 
defined over an arbitrary field. This insight opened new possibilities in pure algebra, by providing a uniform 
construction for most finite simple groups, as well as in algebraic geometry. The theory of automorphic forms, an 
important branch of modern number theory, deals extensively with analogues of Lie groups over adele rings; p-adic 
Lie groups play an important role, via their connections with Galois representations in number theory. 



Definitions and examples 

A real Lie group is a group which is also a finite-dimensional real smooth manifold, and in which the group 
operations of multiplication and inversion are smooth maps. Smoothness of the group multiplication

$$\mu : G \times G \to G, \qquad \mu(x, y) = xy$$

means that μ is a smooth mapping of the product manifold G × G into G. These two requirements can be combined to
the single requirement that the mapping

$$(x, y) \mapsto x^{-1} y$$

be a smooth mapping of the product manifold into G.
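
As a short worked check of why one map is enough, write φ(x, y) = x^{-1}y for the combined map (the name φ is introduced here only for illustration) and let e be the identity of G. Inversion and multiplication can then both be recovered from φ:

```latex
\varphi(x, y) = x^{-1} y, \qquad
x^{-1} = \varphi(x, e), \qquad
xy = \varphi\bigl(x^{-1}, y\bigr) = \varphi\bigl(\varphi(x, e),\, y\bigr).
```

Since x ↦ (x, e) is smooth, smoothness of φ gives smoothness of inversion, and multiplication is then a composite of smooth maps.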






First examples 

• The 2×2 real invertible matrices form a group under multiplication, denoted by GL_2(R):

$$\mathrm{GL}_2(\mathbf{R}) = \left\{ A = \begin{pmatrix} a & b \\ c & d \end{pmatrix} : \det A = ad - bc \neq 0 \right\}.$$

This is a four-dimensional noncompact real Lie group. This group is disconnected; it has two connected 
components corresponding to the positive and negative values of the determinant. 

• The rotation matrices form a subgroup of GL_2(R), denoted by SO_2(R). It is a Lie group in its own right:
specifically, a one-dimensional compact connected Lie group which is diffeomorphic to the circle. Using the
rotation angle φ as a parameter, this group can be parametrized as follows:

$$\mathrm{SO}_2(\mathbf{R}) = \left\{ \begin{pmatrix} \cos\varphi & -\sin\varphi \\ \sin\varphi & \cos\varphi \end{pmatrix} : \varphi \in \mathbf{R}/2\pi\mathbf{Z} \right\}.$$

Addition of the angles corresponds to multiplication of the elements of SO_2(R), and taking the opposite angle
corresponds to inversion. Thus both multiplication and inversion are differentiable maps.

• The orthogonal group also forms an interesting example of a Lie group. 

All of the previous examples of Lie groups fall within the class of classical groups.
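
The correspondence between angles and rotation matrices described in the examples above can be checked numerically. The following is a minimal sketch using only the Python standard library; the helper names `rotation`, `matmul` and `close` are illustrative choices and not part of the source.

```python
import math

def rotation(phi):
    """Return the 2x2 rotation matrix for the angle phi (in radians)."""
    c, s = math.cos(phi), math.sin(phi)
    return [[c, -s], [s, c]]

def matmul(a, b):
    """Multiply two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def close(a, b, tol=1e-12):
    """Entrywise comparison of two matrices up to a tolerance."""
    return all(abs(x - y) <= tol
               for row_a, row_b in zip(a, b)
               for x, y in zip(row_a, row_b))

alpha, beta = 0.7, 2.1  # arbitrary sample angles

# Multiplying two rotations adds their angles.
product_matches_sum = close(matmul(rotation(alpha), rotation(beta)),
                            rotation(alpha + beta))

# The opposite angle gives the inverse matrix.
inverse_matches_negation = close(matmul(rotation(alpha), rotation(-alpha)),
                                 rotation(0.0))

# The parameter is only defined modulo 2*pi.
periodic = close(rotation(alpha), rotation(alpha + 2 * math.pi))
```

All three flags should evaluate to `True`, reflecting that the map from angles mod 2π to SO_2(R) respects the group operations.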

Related concepts 

A complex Lie group is defined in the same way using complex manifolds rather than real ones (example: SL_2(C)),
and similarly one can define a p-adic Lie group over the p-adic numbers. Hilbert's fifth problem asked whether
replacing differentiable manifolds with topological or analytic ones can yield new examples. The answer to this 
question turned out to be negative: in 1952, Gleason, Montgomery and Zippin showed that if G is a topological 
manifold with continuous group operations, then there exists exactly one analytic structure on G which turns it into a 
Lie group (see also Hilbert-Smith conjecture). If the underlying manifold is allowed to be infinite dimensional (for 
example, a Hilbert manifold) then one arrives at the notion of an infinite-dimensional Lie group. It is possible to 
define analogues of many Lie groups over finite fields, and these give most of the examples of finite simple groups. 

The language of category theory provides a concise definition for Lie groups: a Lie group is a group object in the 
category of smooth manifolds. This is important, because it allows generalization of the notion of a Lie group to Lie 
supergroups. 



More examples of Lie groups 

Lie groups occur in abundance throughout mathematics and physics. Matrix groups or algebraic groups are (roughly) 
groups of matrices (for example, orthogonal and symplectic groups), and these give most of the more common 
examples of Lie groups. 

Examples 

• Euclidean space R^n with ordinary vector addition as the group operation becomes an n-dimensional noncompact
abelian Lie group.

• The circle group S^1 consisting of angles mod 2π under addition or, alternately, the complex numbers with
absolute value 1 under multiplication. This is a one-dimensional compact connected abelian Lie group.

• The group GL_n(R) of invertible matrices (under matrix multiplication) is a Lie group of dimension n^2, called the
general linear group. It has a closed connected subgroup SL_n(R), the special linear group, consisting of matrices
of determinant 1 which is also a Lie group.

• The orthogonal group O_n(R), consisting of all n × n orthogonal matrices with real entries, is an
n(n − 1)/2-dimensional Lie group. This group is disconnected, but it has a connected subgroup SO_n(R) of the same
dimension consisting of orthogonal matrices of determinant 1, called the special orthogonal group (for n = 3, the
rotation group).

• The Euclidean group E_n(R) is the Lie group of all Euclidean motions, i.e., isometric affine maps, of
n-dimensional Euclidean space R^n.

• The unitary group U(n) consisting of n × n unitary matrices (with complex entries) is a compact connected Lie
group of dimension n^2. Unitary matrices of determinant 1 form a closed connected subgroup of dimension n^2 − 1
denoted SU(n), the special unitary group.

• Spin groups are double covers of the special orthogonal groups, used for studying fermions in quantum field 
theory (among other things). 

• The symplectic group Sp_{2n}(R) consists of all 2n × 2n matrices preserving a nondegenerate skew-symmetric
bilinear form on R^{2n} (the symplectic form). It is a connected Lie group of dimension 2n^2 + n. The fundamental
group of the symplectic group is Z and this fact is related to the theory of Maslov index.

• The 3-sphere S^3 forms a Lie group by identification with the set of quaternions of unit norm, called versors. The
only other spheres that admit the structure of a Lie group are the 0-sphere S^0 (real numbers with absolute value 1)
and the circle S^1 (complex numbers with absolute value 1). For example, for even n > 1, S^n is not a Lie group
because it does not admit a nonvanishing vector field and so a fortiori cannot be parallelizable as a differentiable
manifold. Of the spheres only S^0, S^1, S^3, and S^7 are parallelizable. The latter carries the structure of a Lie
quasigroup (a nonassociative group), which can be identified with the set of unit octonions.

• The group of invertible upper triangular n by n matrices is a solvable Lie group of dimension n(n + 1)/2.

• The Lorentz group and the Poincaré group are the groups of linear and affine isometries of the Minkowski space
(interpreted as the spacetime of special relativity). They are Lie groups of dimensions 6 and 10.

• The Heisenberg group is a connected nilpotent Lie group of dimension 3, playing a key role in quantum 
mechanics. 

• The group U(1)×SU(2)×SU(3) is a Lie group of dimension 1+3+8=12 that is the gauge group of the Standard
Model in particle physics. The dimensions of the factors correspond to the 1 photon + 3 vector bosons + 8 gluons
of the Standard Model.

• The (3-dimensional) metaplectic group is a double cover of SL_2(R) playing an important role in the theory of
modular forms. It is a connected Lie group that cannot be faithfully represented by matrices of finite size, i.e., a 
nonlinear group. 

• The exceptional Lie groups of types G_2, F_4, E_6, E_7, E_8 have dimensions 14, 52, 78, 133, and 248. There is also a
group E_{7½} of dimension 190.
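
The dimension formulas quoted in the list above can be collected into a small table and cross-checked against the specific numbers mentioned. This is only a bookkeeping sketch; the function names are hypothetical. The Lorentz group is treated here as having the same dimension as O_4(R), and the Poincaré group as adding 4 translation directions, which are assumptions made explicit in the comments.

```python
def dim_gl(n):
    """Dimension of GL_n(R): all n x n matrices."""
    return n * n

def dim_o(n):
    """Dimension of O_n(R) and SO_n(R): n(n - 1)/2."""
    return n * (n - 1) // 2

def dim_u(n):
    """Dimension of U(n) as a real Lie group: n^2."""
    return n * n

def dim_su(n):
    """Dimension of SU(n): n^2 - 1."""
    return n * n - 1

def dim_sp(n):
    """Dimension of Sp_{2n}(R): 2n^2 + n."""
    return 2 * n * n + n

def dim_upper_triangular(n):
    """Dimension of the invertible upper triangular n x n matrices."""
    return n * (n + 1) // 2

# Gauge group of the Standard Model: U(1) x SU(2) x SU(3).
standard_model_dim = dim_u(1) + dim_su(2) + dim_su(3)

# Assumption: the Lorentz group has the dimension of an orthogonal group
# in 4 variables, and the Poincare group adds 4 translations.
lorentz_dim = dim_o(4)
poincare_dim = lorentz_dim + 4
```

With these definitions `standard_model_dim` is 12, `lorentz_dim` is 6 and `poincare_dim` is 10, matching the figures stated in the list.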

Constructions 

There are several standard ways to form new Lie groups from old ones: 

• The product of two Lie groups is a Lie group. 

• Any topologically closed subgroup of a Lie group is a Lie group. This is known as Cartan's theorem. 

• The quotient of a Lie group by a closed normal subgroup is a Lie group. 

• The universal cover of a connected Lie group is a Lie group. For example, the group R is the universal cover of
the circle group S^1. In fact any covering of a differentiable manifold is also a differentiable manifold, but by
specifying universal cover, one guarantees a group structure (compatible with its other structures). 
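
The example of R covering the circle group can be illustrated with the map t ↦ exp(it). The sketch below, using only the standard library, checks that this map is a group homomorphism from (R, +) to the unit complex numbers and that 2π lies in its kernel; the function name `covering_map` is an illustrative choice.

```python
import cmath
import math

def covering_map(t):
    """Send the real number t to the point exp(i*t) on the unit circle."""
    return cmath.exp(1j * t)

s, t = 1.3, -4.8  # arbitrary sample points of R

# Addition in R corresponds to multiplication on the circle.
is_homomorphism = abs(covering_map(s + t)
                      - covering_map(s) * covering_map(t)) < 1e-12

# Every image point has absolute value 1.
lands_on_circle = abs(abs(covering_map(t)) - 1.0) < 1e-12

# 2*pi is mapped to the identity, so the map is not injective.
kernel_contains_2pi = abs(covering_map(2 * math.pi) - 1.0) < 1e-12
```

The kernel of this map is the discrete subgroup 2πZ of R, so the circle group appears as a quotient of its universal cover by a discrete subgroup.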






Related notions 

Some examples of groups that are not Lie groups (except in the trivial sense that any group can be viewed as a 
0-dimensional Lie group, with the discrete topology), are: 

• Infinite dimensional groups, such as the additive group of an infinite dimensional real vector space. These are not
Lie groups as they are not finite dimensional manifolds.

• Some totally disconnected groups, such as the Galois group of an infinite extension of fields, or the additive group
of the p-adic numbers. These are not Lie groups because their underlying spaces are not real manifolds. (Some of
these groups are "p-adic Lie groups".) In general, only topological groups having similar local properties to R^n for
some positive integer n can be Lie groups (of course they must also have a differentiable structure).

Early history 

According to the most authoritative source on the early history of Lie groups (Hawkins, p. 1), Sophus Lie himself 
considered the winter of 1873-1874 as the birth date of his theory of continuous groups. Hawkins, however, 
suggests that it was "Lie's prodigious research activity during the four-year period from the fall of 1869 to the fall of 
1873" that led to the theory's creation (ibid). Some of Lie's early ideas were developed in close collaboration with 
Felix Klein. Lie met with Klein every day from October 1869 through 1872: in Berlin from the end of October 1869 
to the end of February 1870, and in Paris, Göttingen and Erlangen in the subsequent two years (ibid, p. 2). Lie stated
that all of the principal results were obtained by 1884. But during the 1870s all his papers (except the very first note) 
were published in Norwegian journals, which impeded recognition of the work throughout the rest of Europe (ibid, 
p. 76). In 1884 a young German mathematician, Friedrich Engel, came to work with Lie on a systematic treatise to 
expose his theory of continuous groups. From this effort resulted the three-volume Theorie der
Transformationsgruppen, published in 1888, 1890, and 1893.

Lie's ideas did not stand in isolation from the rest of mathematics. In fact, his interest in the geometry of differential 
equations was first motivated by the work of Carl Gustav Jacobi, on the theory of partial differential equations of 
first order and on the equations of classical mechanics. Much of Jacobi's work was published posthumously in the 
1860s, generating enormous interest in France and Germany (Hawkins, p. 43). Lie's idée fixe was to develop a theory
of symmetries of differential equations that would accomplish for them what Évariste Galois had done for algebraic
equations: namely, to classify them in terms of group theory. Lie and other mathematicians showed that the most 
important equations for special functions and orthogonal polynomials tend to arise from group theoretical 
symmetries. Additional impetus to consider continuous groups came from ideas of Bernhard Riemann, on the 
foundations of geometry, and their further development in the hands of Klein. Thus three major themes in 19th 
century mathematics were combined by Lie in creating his new theory: the idea of symmetry, as exemplified by 
Galois through the algebraic notion of a group; geometric theory and the explicit solutions of differential equations 
of mechanics, worked out by Poisson and Jacobi; and the new understanding of geometry that emerged in the works 
of Plücker, Möbius, Grassmann and others, and culminated in Riemann's revolutionary vision of the subject.

Although today Sophus Lie is rightfully recognized as the creator of the theory of continuous groups, a major stride 
in the development of their structure theory, which was to have a profound influence on subsequent development of 
mathematics, was made by Wilhelm Killing, who in 1888 published the first paper in a series entitled Die 
Zusammensetzung der stetigen endlichen Transformationsgruppen (The composition of continuous finite
transformation groups) (Hawkins, p. 100). The work of Killing, later refined and generalized by Élie Cartan, led to
classification of semisimple Lie algebras, Cartan's theory of symmetric spaces, and Hermann Weyl's description of 
representations of compact and semisimple Lie groups using highest weights. 

Weyl brought the early period of the development of the theory of Lie groups to fruition, for not only did he classify 
irreducible representations of semisimple Lie groups and connect the theory of groups with quantum mechanics, but 
he also put Lie's theory itself on firmer footing by clearly enunciating the distinction between Lie's infinitesimal
groups (i.e., Lie algebras) and the Lie groups proper, and began investigations of topology of Lie groups (Borel
(2001)). The theory of Lie groups was systematically reworked in modern mathematical language in a monograph
by Claude Chevalley.



The concept of a Lie group, and possibilities of classification 

Lie groups may be thought of as smoothly varying families of symmetries. Examples of symmetries include rotation 
about an axis. What must be understood is the nature of 'small' transformations, e.g., rotations through tiny angles, 
that link nearby transformations. The mathematical object capturing this structure is called a Lie algebra (Lie himself 
called them "infinitesimal groups"). It can be defined because Lie groups are manifolds, so have tangent spaces at 
each point. 

The Lie algebra of any compact Lie group (very roughly: one for which the symmetries form a bounded set) can be 
decomposed as a direct sum of an abelian Lie algebra and some number of simple ones. The structure of an abelian 
Lie algebra is mathematically uninteresting (since the Lie bracket is identically zero); the interest is in the simple 
summands. Hence the question arises: what are the simple Lie algebras of compact groups? It turns out that they 
mostly fall into four infinite families, the "classical Lie algebras" A_n, B_n, C_n and D_n, which have simple descriptions
in terms of symmetries of Euclidean space. But there are also just five "exceptional Lie algebras" that do not fall into
any of these families. E_8 is the largest of these.

Properties 

• The diffeomorphism group of a Lie group acts transitively on the Lie group.

• Every Lie group is parallelizable, and hence an orientable manifold (there is a bundle isomorphism between its
tangent bundle and the product of itself with the tangent space at the identity).



Types of Lie groups and structure theory 

Lie groups are classified according to their algebraic properties (simple, semisimple, solvable, nilpotent, abelian), 
their connectedness (connected or simply connected) and their compactness. 

• Compact Lie groups are all known: they are finite central quotients of a product of copies of the circle group S^1
and simple compact Lie groups (which correspond to connected Dynkin diagrams). 

• Any simply connected solvable Lie group is isomorphic to a closed subgroup of the group of invertible upper 
triangular matrices of some rank, and any finite dimensional irreducible representation of such a group is 1 
dimensional. Solvable groups are too messy to classify except in a few small dimensions. 

• Any simply connected nilpotent Lie group is isomorphic to a closed subgroup of the group of invertible upper 
triangular matrices with 1's on the diagonal of some rank, and any finite dimensional irreducible representation of
such a group is 1 dimensional. Like solvable groups, nilpotent groups are too messy to classify except in a few 
small dimensions. 

• Simple Lie groups are sometimes defined to be those that are simple as abstract groups, and sometimes defined to 
be connected Lie groups with a simple Lie algebra. For example, SL_2(R) is simple according to the second
definition but not according to the first. They have all been classified (for either definition).

• Semisimple Lie groups are Lie groups whose Lie algebra is a product of simple Lie algebras. They are central
extensions of products of simple Lie groups. 

The identity component of any Lie group is an open normal subgroup, and the quotient group is a discrete group. 
The universal cover of any connected Lie group is a simply connected Lie group, and conversely any connected Lie 
group is a quotient of a simply connected Lie group by a discrete normal subgroup of the center. Any Lie group G 
can be decomposed into discrete, simple, and abelian groups in a canonical way as follows. Write

G_con for the connected component of the identity
G_sol for the largest connected normal solvable subgroup
G_nil for the largest connected normal nilpotent subgroup

so that we have a sequence of normal subgroups

$$1 \subseteq G_{\mathrm{nil}} \subseteq G_{\mathrm{sol}} \subseteq G_{\mathrm{con}} \subseteq G.$$

Then

G/G_con is discrete.
G_con/G_sol is a central extension of a product of simple connected Lie groups.
G_sol/G_nil is abelian. A connected abelian Lie group is isomorphic to a product of copies of R and the circle
group S^1.
G_nil/1 is nilpotent, and therefore its ascending central series has all quotients abelian.

This can be used to reduce some problems about Lie groups (such as finding their unitary representations) to the 
same problems for connected simple groups and nilpotent and solvable subgroups of smaller dimension. 

The Lie algebra associated with a Lie group 

To every Lie group, we can associate a Lie algebra, whose underlying vector space is the tangent space of G at the 
identity element, which completely captures the local structure of the group. Informally we can think of elements of 
the Lie algebra as elements of the group that are "infinitesimally close" to the identity, and the Lie bracket is 
something to do with the commutator of two such infinitesimal elements. Before giving the abstract definition we 
give a few examples: 

• The Lie algebra of the vector space R^n is just R^n with the Lie bracket given by

[A, B] = 0.

(In general the Lie bracket of a connected Lie group is always 0 if and only if the Lie group is abelian.)

• The Lie algebra of the general linear group GL_n(R) of invertible matrices is the vector space M_n(R) of square
matrices with the Lie bracket given by

[A, B] = AB − BA.

If G is a closed subgroup of GL_n(R) then the Lie algebra of G can be thought of informally as the matrices m of
M_n(R) such that 1 + εm is in G, where ε is an infinitesimal positive number with ε^2 = 0 (of course, no such real
number ε exists). For example, the orthogonal group O_n(R) consists of matrices A with AA^T = 1, so the Lie algebra
consists of the matrices m with (1 + εm)(1 + εm)^T = 1, which is equivalent to m + m^T = 0 because ε^2 = 0.

• Formally, when working over the reals, as here, this is accomplished by considering the limit as ε → 0; but the
"infinitesimal" language generalizes directly to Lie groups over general rings.

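The informal "infinitesimal" description of the Lie algebra of the orthogonal group can be tested numerically with a small real ε in place of an infinitesimal one. In the sketch below (standard library only; all names are illustrative), the failure of 1 + εm to be orthogonal shrinks like ε^2 when m is skew-symmetric, but only like ε for a sample matrix that is not skew-symmetric.

```python
def transpose(m):
    """Transpose of a square matrix given as nested lists."""
    return [list(row) for row in zip(*m)]

def matmul(a, b):
    """Multiply two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def orthogonality_defect(m, eps):
    """Largest absolute entry of (1 + eps*m)(1 + eps*m)^T - 1."""
    n = len(m)
    a = [[(1.0 if i == j else 0.0) + eps * m[i][j] for j in range(n)]
         for i in range(n)]
    p = matmul(a, transpose(a))
    return max(abs(p[i][j] - (1.0 if i == j else 0.0))
               for i in range(n) for j in range(n))

# A skew-symmetric matrix (m + m^T = 0) and one that is not.
skew = [[0.0, 2.0, -1.0], [-2.0, 0.0, 0.5], [1.0, -0.5, 0.0]]
not_skew = [[0.0, 1.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]

# Shrinking eps by a factor of 10 shrinks the defect by about 100
# in the skew-symmetric case (second order in eps) ...
skew_ratio = (orthogonality_defect(skew, 1e-3)
              / orthogonality_defect(skew, 1e-4))

# ... but only by about 10 otherwise (first order in eps).
not_skew_ratio = (orthogonality_defect(not_skew, 1e-3)
                  / orthogonality_defect(not_skew, 1e-4))
```

This matches the expansion (1 + εm)(1 + εm)^T = 1 + ε(m + m^T) + ε^2 m m^T: the first-order term vanishes exactly when m + m^T = 0.
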
The concrete definition given above is easy to work with, but has some minor problems: to use it we first need to 
represent a Lie group as a group of matrices, but not all Lie groups can be represented in this way, and it is not 
obvious that the Lie algebra is independent of the representation we use. To get round these problems we give the 
general definition of the Lie algebra of any Lie group (in 4 steps): 

1. Vector fields on any smooth manifold M can be thought of as derivations X of the ring of smooth functions on the
manifold, and therefore form a Lie algebra under the Lie bracket [X, Y] = XY - YX, because the Lie bracket of any 
two derivations is a derivation. 

2. If G is any group acting smoothly on the manifold M, then it acts on the vector fields, and the vector space of 
vector fields fixed by the group is closed under the Lie bracket and therefore also forms a Lie algebra. 

3. We apply this construction to the case when the manifold M is the underlying space of a Lie group G, with G
acting on G = M by left translations L_g(h) = gh. This shows that the space of left invariant vector fields (vector
fields satisfying (L_g)_* X_h = X_{gh} for every h in G, where (L_g)_* denotes the differential of L_g) on a Lie group is a Lie
algebra under the Lie bracket of vector fields.

4. Any tangent vector at the identity of a Lie group can be extended to a left invariant vector field by left translating
the tangent vector to other points of the manifold. Specifically, the left invariant extension of an element v of the
tangent space at the identity is the vector field defined by v^_g = (L_g)_* v. This identifies the tangent space T_e at the
identity with the space of left invariant vector fields, and therefore makes the tangent space at the identity into a
Lie algebra, called the Lie algebra of G, usually denoted by a Fraktur 𝔤. Thus the Lie bracket on 𝔤 is given
explicitly by [v, w] = [v^, w^]_e.

This Lie algebra 𝔤 is finite-dimensional and it has the same dimension as the manifold G. The Lie algebra of G
determines G up to "local isomorphism", where two Lie groups are called locally isomorphic if they look the same 
near the identity element. Problems about Lie groups are often solved by first solving the corresponding problem for 
the Lie algebras, and the result for groups then usually follows easily. For example, simple Lie groups are usually 
classified by first classifying the corresponding Lie algebras. 

We could also define a Lie algebra structure on T_e using right invariant vector fields instead of left invariant vector
fields. This leads to the same Lie algebra, because the inverse map on G can be used to identify left invariant vector
fields with right invariant vector fields, and acts as −1 on the tangent space T_e.

The Lie algebra structure on T_e can also be described as follows: the commutator operation

$$(x, y) \mapsto x y x^{-1} y^{-1}$$

on G × G sends (e, e) to e, so its derivative yields a bilinear operation on T_e G. This bilinear operation is actually the
zero map, but the second derivative, under the proper identification of tangent spaces, yields an operation that
satisfies the axioms of a Lie bracket, and it is equal to twice the one defined through left-invariant vector fields.
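
For matrix groups this description can be tried out numerically. The sketch below (standard library only; the matrices A and B and all helper names are illustrative choices) forms the group commutator of exp(tA) and exp(tB) for a small t and checks that it agrees with the identity to first order, while its t^2 coefficient approximates the bracket AB − BA. Along this curve the second derivative in t is therefore twice the bracket.

```python
def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def scale(c, a):
    return [[c * x for x in row] for row in a]

def expm(a, terms=25):
    """Matrix exponential by a truncated power series (fine for small matrices)."""
    result, term = identity(len(a)), identity(len(a))
    for k in range(1, terms):
        term = scale(1.0 / k, matmul(term, a))  # term is now a^k / k!
        result = add(result, term)
    return result

def bracket(a, b):
    """Matrix commutator AB - BA."""
    return add(matmul(a, b), scale(-1.0, matmul(b, a)))

A = [[0.0, 1.0], [0.0, 0.0]]
B = [[0.0, 0.0], [1.0, 0.0]]

def group_commutator(t):
    """x y x^{-1} y^{-1} with x = exp(tA) and y = exp(tB)."""
    x, y = expm(scale(t, A)), expm(scale(t, B))
    x_inv, y_inv = expm(scale(-t, A)), expm(scale(-t, B))
    return matmul(matmul(x, y), matmul(x_inv, y_inv))

t = 1e-3
deviation = add(group_commutator(t), scale(-1.0, identity(2)))
first_order_size = max(abs(v) for row in deviation for v in row)

# Dividing the deviation by t^2 should approximate [A, B].
second_order = scale(1.0 / t**2, deviation)
lie_bracket = bracket(A, B)
max_error = max(abs(second_order[i][j] - lie_bracket[i][j])
                for i in range(2) for j in range(2))
```

Here `first_order_size` is of order t^2 rather than t, and `max_error` is of order t, consistent with the group commutator being 1 + t^2 [A, B] plus higher-order terms.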

Homomorphisms and isomorphisms 

If G and H are Lie groups, then a Lie-group homomorphism f : G → H is a smooth group homomorphism. (It is
equivalent to require only that f be continuous rather than smooth.) The composition of two such homomorphisms is
again a homomorphism, and the class of all Lie groups, together with these morphisms, forms a category. Two Lie
groups are called isomorphic if there exists a bijective homomorphism between them whose inverse is also a
homomorphism. Isomorphic Lie groups are essentially the same; they only differ in the notation for their elements.

Every homomorphism f : G → H of Lie groups induces a homomorphism between the corresponding Lie algebras 𝔤
and 𝔥. The association G ↦ 𝔤 is a functor (mapping between categories satisfying certain axioms).

One version of Ado's theorem is that every finite dimensional Lie algebra is isomorphic to a matrix Lie algebra. For 
every finite dimensional matrix Lie algebra, there is a linear group (matrix Lie group) with this algebra as its Lie 
algebra. So every abstract Lie algebra is the Lie algebra of some (linear) Lie group. 

The global structure of a Lie group is not determined by its Lie algebra; for example, if Z is any discrete subgroup of 
the center of G then G and G/Z have the same Lie algebra (see the table of Lie groups for examples). A connected 
Lie group is simple, semisimple, solvable, nilpotent, or abelian if and only if its Lie algebra has the corresponding 
property. 

If we require that the Lie group be simply connected, then the global structure is determined by its Lie algebra: for 
every finite dimensional Lie algebra 𝔤 over F (the real or complex numbers) there is a simply connected Lie group G with 𝔤 as Lie algebra, unique
up to isomorphism. Moreover every homomorphism between Lie algebras lifts to a unique homomorphism between 
the corresponding simply connected Lie groups. 






The exponential map 

The exponential map from the Lie algebra M_n(R) of the general linear group GL_n(R) to GL_n(R) is defined by the
usual power series:

$$\exp(A) = 1 + A + \frac{A^2}{2!} + \frac{A^3}{3!} + \cdots$$

for matrices A. If G is any closed subgroup of GL_n(R), then the exponential map takes the Lie algebra of G into G, so we
have an exponential map for all matrix groups.
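
As a minimal sketch of the power series in action (standard library only; the helper names are illustrative), the code below sums a truncated series for the skew-symmetric generator of plane rotations and compares the result with the rotation matrix for the same angle. It also checks the one-parameter property exp(sX) exp(tX) = exp((s + t)X).

```python
import math

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def scale(c, a):
    return [[c * x for x in row] for row in a]

def expm(a, terms=30):
    """Truncated series 1 + A + A^2/2! + ... (adequate for small matrices)."""
    result, term = identity(len(a)), identity(len(a))
    for k in range(1, terms):
        term = scale(1.0 / k, matmul(term, a))  # term is now a^k / k!
        result = add(result, term)
    return result

def close(a, b, tol=1e-12):
    return all(abs(x - y) <= tol
               for ra, rb in zip(a, b) for x, y in zip(ra, rb))

X = [[0.0, -1.0], [1.0, 0.0]]  # generator of rotations, X + X^T = 0
theta = 0.9

rotation = [[math.cos(theta), -math.sin(theta)],
            [math.sin(theta), math.cos(theta)]]

# exp(theta * X) is the rotation by the angle theta.
series_gives_rotation = close(expm(scale(theta, X)), rotation)

# exp(sX) exp(tX) = exp((s + t)X) along a single direction X.
s, t = 0.4, 0.5
one_parameter_law = close(matmul(expm(scale(s, X)), expm(scale(t, X))),
                          expm(scale(s + t, X)))
```

The generator X lies in the Lie algebra of the orthogonal group, and its exponential lands in SO_2(R), in line with the statement that the exponential map takes the Lie algebra of a matrix group into the group.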

The definition above is easy to use, but it is not defined for Lie groups that are not matrix groups, and it is not clear 
that the exponential map of a Lie group does not depend on its representation as a matrix group. We can solve both 
problems using a more abstract definition of the exponential map that works for all Lie groups, as follows. 

Every vector v in 𝔤 determines a linear map from R to 𝔤 taking 1 to v, which can be thought of as a Lie algebra
homomorphism. Because R is the Lie algebra of the simply connected Lie group R, this induces a Lie group
homomorphism c : R → G so that

c(s + t) = c(s)c(t)

for all s and t. The operation on the right hand side is the group multiplication in G. The formal similarity of this
formula with the one valid for the exponential function justifies the definition

exp(v) = c(1).

This is called the exponential map, and it maps the Lie algebra 𝔤 into the Lie group G. It provides a diffeomorphism
between a neighborhood of 0 in 𝔤 and a neighborhood of e in G. This exponential map is a generalization of the
exponential function for real numbers (because R is the Lie algebra of the Lie group of positive real numbers with
multiplication), for complex numbers (because C is the Lie algebra of the Lie group of non-zero complex numbers
with multiplication) and for matrices (because M_n(R) with the regular commutator is the Lie algebra of the Lie group
GL_n(R) of all invertible matrices).

Because the exponential map is surjective on some neighbourhood N of e, it is common to call elements of the Lie 
algebra infinitesimal generators of the group G. The subgroup of G generated by N is the identity component of G. 

The exponential map and the Lie algebra determine the local group structure of every connected Lie group, because 
of the Baker-Campbell-Hausdorff formula: there exists a neighborhood U of the zero element of 𝔤, such that for u,
v in U we have

exp(u) exp(v) = exp(u + v + 1/2 [u, v] + 1/12 [[u, v], v] − 1/12 [[u, v], u] − ...)

where the omitted terms are known and involve Lie brackets of four or more elements. In case u and v commute, this
formula reduces to the familiar exponential law exp(u) exp(v) = exp(u + v).
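
A case where the formula can be checked exactly is the Lie algebra of strictly upper triangular 3 × 3 matrices (the Lie algebra of the Heisenberg group): there every bracket of three or more elements vanishes, so the series stops after the term 1/2 [u, v]. The sketch below (standard library only; the sample matrices and helper names are illustrative) verifies this and also shows that the naive law exp(u) exp(v) = exp(u + v) fails when u and v do not commute.

```python
def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def scale(c, a):
    return [[c * x for x in row] for row in a]

def expm(a, terms=10):
    """Power series for exp; it terminates here because a^3 = 0."""
    result, term = identity(len(a)), identity(len(a))
    for k in range(1, terms):
        term = scale(1.0 / k, matmul(term, a))  # term is now a^k / k!
        result = add(result, term)
    return result

def bracket(a, b):
    """Matrix commutator AB - BA."""
    return add(matmul(a, b), scale(-1.0, matmul(b, a)))

def close(a, b, tol=1e-12):
    return all(abs(x - y) <= tol
               for ra, rb in zip(a, b) for x, y in zip(ra, rb))

# Two strictly upper triangular matrices that do not commute.
u = [[0.0, 1.0, 3.0], [0.0, 0.0, 2.0], [0.0, 0.0, 0.0]]
v = [[0.0, -2.0, 1.0], [0.0, 0.0, 4.0], [0.0, 0.0, 0.0]]

lhs = matmul(expm(u), expm(v))

# Baker-Campbell-Hausdorff series, which stops after the first bracket here.
bch_exponent = add(add(u, v), scale(0.5, bracket(u, v)))
bch_holds = close(lhs, expm(bch_exponent))

# Without the bracket correction the two sides differ.
naive_law_fails = not close(lhs, expm(add(u, v)))

# The higher brackets appearing in the formula vanish in this algebra.
triple_brackets_vanish = (close(bracket(bracket(u, v), v), scale(0.0, u))
                          and close(bracket(bracket(u, v), u), scale(0.0, u)))
```

In a general Lie algebra the series does not terminate, so this example only illustrates the first correction term.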

The exponential map from the Lie algebra to the Lie group is not always onto, even if the group is connected (though 
it does map onto the Lie group for connected groups that are either compact or nilpotent). For example, the 
exponential map of SL_2(R) is not surjective.
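
One way to see the failure of surjectivity for SL_2(R) is through the trace. The following worked computation is a sketch of the standard argument. For a real 2 × 2 matrix A of trace zero, the Cayley-Hamilton theorem gives A^2 = −det(A)·1, so the exponential series can be summed in closed form (here s > 0 is an auxiliary parameter introduced for the computation):

```latex
\exp(A) =
\begin{cases}
\cosh(s)\,1 + \dfrac{\sinh s}{s}\,A, & \det A = -s^2 < 0,\\[1.5ex]
1 + A, & \det A = 0,\\[1ex]
\cos(s)\,1 + \dfrac{\sin s}{s}\,A, & \det A = s^2 > 0,
\end{cases}
\qquad\text{so}\qquad
\operatorname{tr}\exp(A) \in \{\, 2\cosh s,\; 2,\; 2\cos s \,\} \;\Rightarrow\; \operatorname{tr}\exp(A) \ge -2 .
```

The diagonal matrix with entries −2 and −1/2 has determinant 1 and trace −5/2 < −2, so it lies in SL_2(R) but is not the exponential of any element of the Lie algebra.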

Infinite dimensional Lie groups 

Lie groups are finite dimensional by definition, but there are many groups that resemble Lie groups, except for being 
infinite dimensional. There is very little "general theory" of such groups, but some of the examples that have been 
studied include: 

• The group of diffeomorphisms of a manifold. Quite a lot is known about the group of diffeomorphisms of the 
circle. Its Lie algebra is (more or less) the Witt algebra, which has a central extension called the Virasoro algebra, 
used in string theory and conformal field theory. Very little is known about the diffeomorphism groups of 
manifolds of larger dimension. The diffeomorphism group of spacetime sometimes appears in attempts to 
quantize gravity. 






• The group of smooth maps from a manifold to a finite dimensional Lie group is called a gauge group (with 
operation of pointwise multiplication), and is used in quantum field theory and Donaldson theory. If the manifold 
is a circle these are called loop groups, and have central extensions whose Lie algebras are (more or less) 
Kac-Moody algebras. 

• There are infinite dimensional analogues of general linear groups, orthogonal groups, and so on. One important 
aspect is that these may have simpler topological properties: see for example Kuiper's theorem. 

• Just as calculus in finite-dimensional real vector spaces can be extended to calculus in Banach spaces, the 
definition of finite-dimensional smooth manifolds can be extended to give a definition of Banach analytic 
manifolds. Similarly, the standard finite-dimensional definition of Lie groups can be extended to give a definition 
of Banach analytic Lie groups. In this case, we have a Banach analytic manifold which simultaneously has a 
group structure such that multiplication and inversion are analytic maps. Some of the theorems of 
finite-dimensional Lie groups do not carry over to the Banach analytic case, and in particular the relation between 
Lie groups and Lie algebras is much more subtle in the infinite dimensional case. However, it is true that "for 
infinite dimensional Lie groups modeled on Banach spaces there is a well-developed theory ... which is closely 
parallel to the theory of finite dimensional Lie groups."

Notes 

[1] Sigurdur Helgason, "Differential Geometry, Lie Groups, and Symmetric Spaces", Academic Press, 1978, page 131. 
[2] Andrew Pressley and Graeme Segal, Loop Groups, Oxford Science Publications, 1986, page 26. 

References 

• Adams, John Frank (1969), Lectures on Lie Groups, Chicago Lectures in Mathematics, Chicago: Univ. of 
Chicago Press, ISBN 0-226-00527-5. 

• Borel, Armand (2001), "Essays in the history of Lie groups and algebraic groups", History of Mathematics 
(American Mathematical Society) 21, ISBN 0-8218-0288-7. 

• Bourbaki, Nicolas, Elements of mathematics: Lie groups and Lie algebras. Chapters 1-3 ISBN 3-540-64242-0, 
Chapters 4-6 ISBN 3-540-42650-7, Chapters 7-9 ISBN 3-540-43405-4 

• Chevalley, Claude (1946), Theory of Lie groups, Princeton: Princeton University Press, ISBN 0-691-04990-4. 

• Fulton, William; Harris, Joe (1991), Representation theory. A first course, Graduate Texts in Mathematics, 
Readings in Mathematics, 129, New York: Springer-Verlag, MR1153249, ISBN 978-0-387-97527-6,
ISBN 978-0-387-97495-8 

• Hall, Brian C. (2003), Lie Groups, Lie Algebras, and Representations: An Elementary Introduction, Springer, 
ISBN 0-387-40122-9. 

• Hawkins, Thomas (2000), Emergence of the theory of Lie groups, Springer, ISBN 0-387-98963-3 

• Knapp, Anthony W. (2002), Lie Groups Beyond an Introduction, Progress in Mathematics, 140 (2nd ed.), Boston: 
Birkhäuser, ISBN 0-8176-4259-5.

• Rossmann, Wulf (2001), Lie Groups: An Introduction Through Linear Groups, Oxford Graduate Texts in 
Mathematics, Oxford University Press, ISBN 978-0198596837. The 2003 reprint corrects several typographical 
mistakes. 

• Serre, Jean-Pierre (1965), Lie Algebras and Lie Groups: 1964 Lectures given at Harvard University, Lecture 
notes in mathematics, 1500, Springer, ISBN 3-540-55008-9. 

• Steeb, Willi-Hans (2007), Continuous Symmetries, Lie algebras, Differential Equations and Computer Algebra: 
second edition, World Scientific Publishing, ISBN 981-270-809-X. 



Hopf algebra 




In mathematics, a Hopf algebra, named after Heinz Hopf, is a structure that is simultaneously a (unital associative) 
algebra, a (counital coassociative) coalgebra, with these structures compatible making it a bialgebra, and moreover is 
equipped with an antiautomorphism satisfying a certain property. 

Hopf algebras occur naturally in algebraic topology, where they originated and are related to the H-space concept, in 
group scheme theory, in group theory (via the concept of a group ring), and in numerous other places, making them 
probably the most familiar type of bialgebra. Hopf algebras are also studied in their own right, with much work on 
specific classes of examples on the one hand and classification problems on the other. 



Formal definition 

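Formally, a Hopf algebra is a bialgebra H over a field K together with a K-linear map $S : H \to H$ (called the antipode) such that the antipode diagram commutes, that is, 

$$\nabla \circ (S \otimes \mathrm{id}) \circ \Delta = \eta \circ \varepsilon = \nabla \circ (\mathrm{id} \otimes S) \circ \Delta.$$ 

Here $\Delta$ is the comultiplication of the bialgebra, $\nabla$ its multiplication, $\eta$ its unit and $\varepsilon$ its counit. In the sumless 
Sweedler notation, this property can also be expressed as 

$$S(c_{(1)})\,c_{(2)} = c_{(1)}\,S(c_{(2)}) = \varepsilon(c)\,1 \quad \text{for all } c \in H.$$ 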

As for algebras, one can replace the underlying field K with a commutative ring R in the above definition. 

The definition of Hopf algebra is self-dual (as reflected in the symmetry of the above diagram), so if one can define a 
dual of H (which is always possible if H is finite-dimensional), then it is automatically a Hopf algebra. 



Properties of the antipode 

The antipode S is sometimes required to have a K-linear inverse, which is automatic in the finite-dimensional case, or 
if H is commutative or cocommutative (or more generally quasitriangular). 

In general, S is an antihomomorphism,[1] so $S^2$ is a homomorphism, which is therefore an automorphism if S is 
invertible (as may be required). 

If $S^2 = \mathrm{id}$, then the Hopf algebra is said to be involutive (and the underlying algebra with involution is a 
*-algebra). If H is finite-dimensional semisimple over a field of characteristic zero, commutative, or cocommutative, 
then it is involutive. 

If a bialgebra B admits an antipode S, then S is unique ("a bialgebra admits at most 1 Hopf algebra structure").[2] 
The antipode is an analog of the inversion map on a group that sends g to $g^{-1}$.[3] 






Hopf subalgebras 

A subalgebra K (not to be confused with the field K in the notation above) of a Hopf algebra H is a Hopf subalgebra 
if it is a subcoalgebra of H and the antipode S maps K into K. In other words, a Hopf subalgebra K is a Hopf algebra 
in its own right when the multiplication, comultiplication, counit and antipode of H are restricted to K (and 
additionally the identity 1 is required to be in K). The Nichols-Zoeller freeness theorem (1989) established that 
H, viewed as either a left or a right K-module in the natural way, is free of finite rank if H is finite dimensional: 
a generalization of Lagrange's theorem for subgroups. As a corollary of this and integral theory, a Hopf subalgebra 
of a semisimple finite dimensional Hopf algebra is automatically semisimple. 

A Hopf subalgebra K is said to be right normal in a Hopf algebra H if it satisfies the condition of stability, 
$\mathrm{ad}_r(h)(K) \subseteq K$ for all h in H, where the right adjoint mapping $\mathrm{ad}_r$ is defined by 
$\mathrm{ad}_r(h)(k) = S(h_{(1)})\,k\,h_{(2)}$ for all k in K, h in H. Similarly, a Hopf subalgebra K is left normal in H if it is stable 
under the left adjoint mapping defined by $\mathrm{ad}_l(h)(k) = h_{(1)}\,k\,S(h_{(2)})$. The two conditions of normality are 
equivalent if the antipode S is bijective, in which case K is said to be a normal Hopf subalgebra. 

A normal Hopf subalgebra K in H satisfies the condition (of equality of subsets of H): $HK^+ = K^+H$, where $K^+$ 
denotes the kernel of the counit on K. This normality condition implies that $HK^+$ is a Hopf ideal of H (i.e. an 
algebra ideal in the kernel of the counit, a coalgebra coideal and stable under the antipode). As a consequence one 
has a quotient Hopf algebra $H/HK^+$ and epimorphism $H \to H/K^+H$, a theory analogous to that of normal 
subgroups and quotient groups in group theory.[4] 

Examples 

Group algebra. Suppose G is a group. The group algebra KG is a unital associative algebra over K. It turns into a 
Hopf algebra if we define 

• $\Delta : KG \to KG \otimes KG$ by $\Delta(g) = g \otimes g$ for all g in G 

• $\varepsilon : KG \to K$ by $\varepsilon(g) = 1$ for all g in G 

• $S : KG \to KG$ by $S(g) = g^{-1}$ for all g in G. 
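
The antipode axiom for the group algebra can be checked mechanically on a small case. The following is a minimal sketch, assuming the symmetric group on three letters and integer coefficients; the names `mul`, `inv`, `add_term` and `antipode_sides` are illustrative helpers, not notation from the text.

```python
from itertools import permutations

# Illustrative model of the group algebra of S_3 with integer coefficients:
# an element is a dict {group element: coefficient}.
GROUP = list(permutations(range(3)))
IDENTITY = (0, 1, 2)

def mul(p, q):
    """Composition of permutations: (p*q)(i) = p(q(i))."""
    return tuple(p[q[i]] for i in range(3))

def inv(p):
    """Inverse permutation."""
    r = [0] * 3
    for i, pi in enumerate(p):
        r[pi] = i
    return tuple(r)

def add_term(x, g, c):
    """Add c*g to the algebra element x in place."""
    x[g] = x.get(g, 0) + c

def antipode_sides(x):
    """Return the three sides of the antipode axiom evaluated on x."""
    left, right = {}, {}
    for g, c in x.items():
        # Delta(g) = g (x) g and S(g) = g^{-1}, extended linearly.
        add_term(left, mul(inv(g), g), c)    # mult o (S (x) id) o Delta
        add_term(right, mul(g, inv(g)), c)   # mult o (id (x) S) o Delta
    target = {IDENTITY: sum(x.values())}     # eta(epsilon(x)), epsilon(g) = 1
    return left, right, target

# A sample element with coefficients 1..6 on the six group elements.
x = {g: i + 1 for i, g in enumerate(GROUP)}
left, right, target = antipode_sides(x)
```

All three dictionaries agree on the sample element, and `inv` reverses products, which is the group-level form of S being an antihomomorphism.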

Functions on a finite group. Suppose now that G is a finite group. Then the set $K^G$ of all functions from G to K with 
pointwise addition and multiplication is a unital associative algebra over K, and $K^{G \times G}$ is naturally isomorphic to 
$K^G \otimes K^G$ (for G infinite, $K^G \otimes K^G$ is a proper subset of $K^{G \times G}$). The set $K^G$ becomes a Hopf algebra if we define 

• $\Delta : K^G \to K^{G \times G}$ by $\Delta(f)(x,y) = f(xy)$ for all f in $K^G$ and all x, y in G 

• $\varepsilon : K^G \to K$ by $\varepsilon(f) = f(e)$ for every f in $K^G$ [here e is the identity element of G] 

• $S : K^G \to K^G$ by $S(f)(x) = f(x^{-1})$ for all f in $K^G$ and all x in G. 

Note that functions on a finite group can be identified with the group ring, though these are more naturally thought of 
as dual - the group ring consists of finite sums of elements, and thus pairs with functions on the group by evaluating 
the function on the summed elements. 

Regular functions on an algebraic group. Generalizing the previous example, we can use the same formulas to 
show that for a given algebraic group G over K, the set of all regular functions on G forms a Hopf algebra. 

Universal enveloping algebra. Suppose g is a Lie algebra over the field K and U is its universal enveloping algebra. 
U becomes a Hopf algebra if we define 

• $\Delta : U \to U \otimes U$ by $\Delta(x) = x \otimes 1 + 1 \otimes x$ for every x in g (this rule is compatible with commutators and can 
therefore be uniquely extended to all of U). 

• $\varepsilon : U \to K$ by $\varepsilon(x) = 0$ for all x in g (again, extended to U) 

• $S : U \to U$ by $S(x) = -x$ for all x in g. 






Cohomology of Lie groups 

The cohomology algebra of a Lie group G is a Hopf algebra: the multiplication is provided by the cup-product, and the 
comultiplication 

$$H^*(G) \to H^*(G \times G) \cong H^*(G) \otimes H^*(G)$$ 

is induced by the group multiplication $G \times G \to G$. This observation was actually a source of the notion of Hopf algebra. 
Using this structure, Hopf proved a structure theorem for the cohomology algebra of Lie groups. 

Theorem (Hopf)[5] Let A be a finite-dimensional, graded commutative, graded cocommutative Hopf algebra over a 
field of characteristic 0. Then A (as an algebra) is a free exterior algebra with generators of odd degree. 

Quantum groups and non-commutative geometry 

All examples above are either commutative (i.e. the multiplication is commutative) or co-commutative (i.e. $\Delta = T \circ \Delta$, 
where $T : H \otimes H \to H \otimes H$ is defined by $T(x \otimes y) = y \otimes x$). Other interesting Hopf algebras are certain 
"deformations" or "quantizations" of those from example 3 which are neither commutative nor co-commutative. 
These Hopf algebras are often called quantum groups, a term that is so far only loosely defined. They are important 
in noncommutative geometry, the idea being the following: a standard algebraic group is well described by its 
standard Hopf algebra of regular functions; we can then think of the deformed version of this Hopf algebra as 
describing a certain "non-standard" or "quantized" algebraic group (which is not an algebraic group at all). While 
there does not seem to be a direct way to define or manipulate these non-standard objects, one can still work with 
their Hopf algebras, and indeed one identifies them with their Hopf algebras. Hence the name "quantum group". 

Related concepts 

Graded Hopf algebras are often used in algebraic topology: they are the natural algebraic structure on the direct sum 
of all homology or cohomology groups of an H-space. 

Locally compact quantum groups generalize Hopf algebras and carry a topology. The algebra of all continuous 
functions on a Lie group is a locally compact quantum group. 

Quasi-Hopf algebras are generalizations of Hopf algebras, where coassociativity only holds up to a twist. 

Weak Hopf algebras, or quantum groupoids, are generalizations of Hopf algebras. Like Hopf algebras, weak Hopf 
algebras form a self-dual class of algebras; i.e., if H is a (weak) Hopf algebra, so is $H^*$, the dual space of linear 
forms on H (with respect to the algebra-coalgebra structure obtained from the natural pairing with H and its 
coalgebra-algebra structure). 

A weak Hopf algebra H is usually taken to be a 

1) finite dimensional algebra and coalgebra with coproduct $\Delta : H \to H \otimes H$ and counit $\varepsilon : H \to k$ satisfying 
all the axioms of a Hopf algebra except possibly $\Delta(1) \neq 1 \otimes 1$ or $\varepsilon(ab) \neq \varepsilon(a)\varepsilon(b)$ for some a, b in H. Instead one requires that 

$$(\Delta(1) \otimes 1)(1 \otimes \Delta(1)) = (1 \otimes \Delta(1))(\Delta(1) \otimes 1) = (\Delta \otimes \mathrm{Id})\Delta(1)$$ 

and 

$$\varepsilon(abc) = \sum \varepsilon(ab_{(1)})\,\varepsilon(b_{(2)}c) = \sum \varepsilon(ab_{(2)})\,\varepsilon(b_{(1)}c) \quad \text{for all } a, b, c \in H.$$ 

2) H has a weakened antipode $S : H \to H$ satisfying the axioms (a) - (c): 

(a) $S(a_{(1)})\,a_{(2)} = 1_{(1)}\,\varepsilon(a 1_{(2)})$ for all a in H (the right-hand side is the interesting projection usually denoted by 
$\Pi^R(a)$ or $\varepsilon_s(a)$, with image a separable subalgebra denoted by $H^R$ or $H_s$); 

(b) $a_{(1)}\,S(a_{(2)}) = \varepsilon(1_{(1)}a)\,1_{(2)}$ for all a in H (another interesting projection usually denoted by $\Pi^L(a)$ or $\varepsilon_t(a)$, 
with image a separable algebra $H^L$ or $H_t$, anti-isomorphic to $H^R$ via S); and 

(c) $S(a_{(1)})\,a_{(2)}\,S(a_{(3)}) = S(a)$ for all a in H. 

Note that if $\Delta(1) = 1 \otimes 1$, these conditions reduce to the two usual conditions on the antipode of a Hopf algebra. 

The axioms are partly chosen so that the category of H-modules is a rigid tensor category. The unit H-module is the 
separable algebra $H^L$ mentioned above. 






For example, a finite groupoid algebra is a weak Hopf algebra. In particular, the groupoid algebra on [n] with one 
pair of invertible arrows $e_{ij}$ and $e_{ji}$ between i and j in [n] is isomorphic to the algebra H of n x n matrices. The 
weak Hopf algebra structure on this particular H is given by coproduct $\Delta(e_{ij}) = e_{ij} \otimes e_{ij}$, counit $\varepsilon(e_{ij}) = 1$ 
and antipode $S(e_{ij}) = e_{ji}$. The separable subalgebras $H^L$ and $H^R$ coincide and are non-central commutative 
algebras in this particular case (the subalgebra of diagonal matrices). 

Early theoretical contributions to weak Hopf algebras are to be found in [6] as well as [7]. 

Hopf algebroids were introduced by J.-H. Lu in 1996 as a result of work on groupoids in Poisson geometry (later shown 
to be equivalent, in a nontrivial way, to a construction of Takeuchi from the 1970s and another by Xu around the year 2000). 
Hopf algebroids generalize weak Hopf algebras and certain skew Hopf algebras. They may be loosely thought of as 
Hopf algebras over a noncommutative base ring, where weak Hopf algebras become Hopf algebras over a separable 
algebra. It is a theorem that a Hopf algebroid satisfying a finite projectivity condition over a separable algebra is a 
weak Hopf algebra, and conversely a weak Hopf algebra H is a Hopf algebroid over its separable subalgebra $H^L$. 
The antipode axioms were changed by G. Böhm and K. Szlachányi (J. Algebra) in 2004 for tensor categorical 
reasons and to accommodate examples associated to depth two Frobenius algebra extensions. 

A left Hopf algebroid (H,R) is a left bialgebroid together with an antipode: the bialgebroid (H,R) consists of a total 
algebra H and a base algebra R and two mappings, an algebra homomorphism $s : R \to H$ called a source map and an 
algebra anti-homomorphism $t : R \to H$ called a target map, such that the commutativity condition 
$s(r_1)\,t(r_2) = t(r_2)\,s(r_1)$ is satisfied for all $r_1, r_2 \in R$. The axioms resemble those of a Hopf algebra but are 
complicated by the possibility that R is a noncommutative algebra or its images under s and t are not in the center of 
H. In particular a left bialgebroid (H,R) has an R-R-bimodule structure on H which prefers the left side as follows: 
$r_1 \cdot h \cdot r_2 = s(r_1)\,t(r_2)\,h$ for all h in H, $r_1, r_2 \in R$. There is a coproduct $\Delta : H \to H \otimes_R H$ and counit 
$\varepsilon : H \to R$ that make $(H, R, \Delta, \varepsilon)$ an R-coring (with axioms like that of a coalgebra such that all mappings are 
R-R-bimodule homomorphisms and all tensors over R). Additionally the bialgebroid (H,R) must satisfy 
$\Delta(ab) = \Delta(a)\Delta(b)$ for all a, b in H, and a condition to make sure this last condition makes sense: every image 
point $\Delta(a)$ satisfies $a_{(1)}\,t(r) \otimes a_{(2)} = a_{(1)} \otimes a_{(2)}\,s(r)$ for all r in R. Also $\Delta(1) = 1 \otimes 1$. The counit is 
required to satisfy $\varepsilon(1_H) = 1_R$ and the condition $\varepsilon(ab) = \varepsilon(a\,s(\varepsilon(b))) = \varepsilon(a\,t(\varepsilon(b)))$. 

The antipode $S : H \to H$ is usually taken to be an algebra anti-automorphism satisfying conditions of 
exchanging the source and target maps and satisfying two axioms like Hopf algebra antipode axioms; see the 
references in Lu or in Böhm-Szlachányi for a more example-category friendly, though somewhat more complicated, 
set of axioms for the antipode S. The latter set of axioms depends on the axioms of a right bialgebroid as well, which 
are a straightforward switching of left to right, s with t, of the axioms for a left bialgebroid given above. 

As an example of a left bialgebroid, take R to be any algebra over a field k. Let H be its algebra of linear 
self-mappings. Let s(r) be left multiplication by r on R; let t(r) be right multiplication by r on R. H is a left bialgebroid 
over R, which may be seen as follows. From the fact that $H \otimes_R H \cong \mathrm{Hom}_k(R \otimes R, R)$ one may define a 
coproduct by $\Delta(f)(r \otimes u) = f(ru)$ for each linear transformation f from R to itself and all r, u in R. 
Coassociativity of the coproduct follows from associativity of the product on R. A counit is given by $\varepsilon(f) = f(1)$. 
The counit axioms of a coring follow from the identity element condition on multiplication in R. The reader will be 
amused, or at least edified, to check that (H,R) is a left bialgebroid. In case R is an Azumaya algebra, in which case 
H is isomorphic to R tensor R, an antipode comes from transposing tensors, which makes H a Hopf algebroid over R. 

Multiplier Hopf algebras, introduced by Alfons Van Daele in 1994,[8] are generalizations of Hopf algebras where 
the comultiplication maps from an algebra (with or without unit) to the multiplier algebra of the tensor product algebra of the 
algebra with itself. 

Hopf group-(co)algebras, introduced by V. G. Turaev in 2000, are also generalizations of Hopf algebras. 






Analogy with groups 

Groups can be axiomatized by the same diagrams (equivalently, operations) as a Hopf algebra, where G is taken to 
be a set instead of a module. In this case: 

• the field K is replaced by the 1-point set 

• there is a natural counit (map to 1 point) 

• there is a natural comultiplication (the diagonal map) 

• the unit is the identity element of the group 

• the multiplication is the multiplication in the group 

• the antipode is the inverse 

In this philosophy, a group can be thought of as a Hopf algebra over the "field with one element".[9] 

See also 

• Quasitriangular Hopf algebra 

• Algebra/set analogy 

• Representation theory of Hopf algebras 

• Ribbon Hopf algebra 

• Superalgebra 

• Supergroup 

• Anyonic Lie algebra 

Notes 

[1] Dăscălescu, Năstăsescu & Raianu (2001), Prop. 4.2.6, p. 153 (http://books.google.com/books?id=pBJ6sbPHA0IC&pg=PA153&dq="is+an+antimorphism+of+algebras") 
[2] Dăscălescu, Năstăsescu & Raianu (2001), Remarks 4.2.3, p. 151 (http://books.google.com/books?id=pBJ6sbPHA0IC&pg=PA151&dq="the+antipode+is+unique") 
[3] Quantum groups lecture notes (http://www.mathematik.uni-muenchen.de/~pareigis/Vorlesungen/QuantGrp/ln2_l.pdf) 
[4] S. Montgomery, Hopf algebras and their actions on rings, Conf. Board in Math. Sci. vol. 82, A.M.S., 1993. ISBN 0-8218-0738-2 
[5] Hopf, 1941. 
[6] Gabriella Böhm, Florian Nill, Kornél Szlachányi. J. Algebra 221 (1999), 385-438 
[7] Dmitri Nikshych, Leonid Vainerman, in: New Directions in Hopf Algebras, S. Montgomery and H.-J. Schneider, eds., M.S.R.I. Publications, vol. 43, Cambridge, 2002, 211-262. 
[8] Alfons Van Daele. Multiplier Hopf algebras (http://www.ams.org/tran/1994-342-02/S0002-9947-1994-1220906-5/S0002-9947-1994-1220906-5.pdf), Transactions of the American Mathematical Society 342(2) (1994) 917-932 
[9] Group = Hopf algebra « Secret Blogging Seminar (http://sbseminar.wordpress.com/2007/10/07/group-hopf-algebra/), Group objects and Hopf algebras (http://www.youtube.com/watch?v=p3kkm5dYH-w), video of Simon Willerton. 

References 

• Dăscălescu, Sorin; Năstăsescu, Constantin; Raianu, Şerban (2001), Hopf Algebras, Pure and Applied 
Mathematics, 235 (1st ed.), Marcel Dekker, ISBN 0-8247-0481-9. 

• Pierre Cartier, A primer of Hopf algebras (http://inc.web.ihes.fr/prepub/PREPRINTS/2006/M/M-06-40.pdf), 
IHES preprint, September 2006, 81 pages 

• Jürgen Fuchs, Affine Lie Algebras and Quantum Groups, (1992), Cambridge University Press. ISBN 
0-521-48412-X 

• H. Hopf, Über die Topologie der Gruppen-Mannigfaltigkeiten und ihrer Verallgemeinerungen, Ann. of Math. 42 
(1941), 22-52. Reprinted in Selecta Heinz Hopf, pp. 119-151, Springer, Berlin (1964). MR4784 

• Street, Ross (2007), Quantum groups, Australian Mathematical Society Lecture Series, 19, Cambridge University 
Press, MR2294803, ISBN 978-0-521-69524-4. 



Quantum group 

In mathematics and theoretical physics, the term quantum group denotes various kinds of noncommutative algebra 
with additional structure. In general, a quantum group is some kind of Hopf algebra. There is no single, 
all-encompassing definition, but instead a family of broadly similar objects. 

The term "quantum group" often denotes a kind of noncommutative algebra with additional structure that first 
appeared in the theory of quantum integrable systems, and which was then formalized by Vladimir Drinfel'd and 
Michio Jimbo as a particular class of Hopf algebra. The same term is also used for other Hopf algebras that deform 
or are close to classical Lie groups or Lie algebras, such as a 'bicrossproduct' class of quantum groups introduced by 
Shahn Majid a little after the work of Drinfeld and Jimbo. 

In Drinfeld's approach, quantum groups arise as Hopf algebras depending on an auxiliary parameter q or h, which 
become universal enveloping algebras of a certain Lie algebra, frequently semisimple or affine, when q = 1 or h = 0. 
Closely related are certain dual objects, also Hopf algebras and also called quantum groups, deforming the algebra of 
functions on the corresponding semisimple algebraic group or a compact Lie group. 

Just as groups often appear as symmetries, quantum groups act on many other mathematical objects and it has 
become fashionable to introduce the adjective quantum in such cases; for example there are quantum planes and 
quantum Grassmannians. 

Intuitive meaning 

The discovery of quantum groups was quite unexpected, since it was known for a long time that compact groups and 
semisimple Lie algebras are "rigid" objects, in other words, they cannot be "deformed". One of the ideas behind 
quantum groups is that if we consider a structure that is in a sense equivalent but larger, namely a group algebra or a 
universal enveloping algebra, then a group or enveloping algebra can be "deformed", although the deformation will 
no longer remain a group or enveloping algebra. More precisely, deformation can be accomplished within the 
category of Hopf algebras that are not required to be either commutative or cocommutative. One can think of the 
deformed object as an algebra of functions on a "noncommutative space", in the spirit of the noncommutative 
geometry of Alain Connes. This intuition, however, came after particular classes of quantum groups had already 
proved their usefulness in the study of the quantum Yang-Baxter equation and quantum inverse scattering method 
developed by the Leningrad School (Ludwig Faddeev, Leon Takhtajan, Evgenii Sklyanin, Nicolai Reshetikhin and 
Korepin) and related work by the Japanese School.[1] The intuition behind the second, bicrossproduct, class of 
quantum groups was different and came from the search for self-dual objects as an approach to quantum gravity.[2] 

Drinfel'd-Jimbo type quantum groups 

One type of object commonly called a "quantum group" appeared in the work of Vladimir Drinfel'd and Michio 
Jimbo as a deformation of the universal enveloping algebra of a semisimple Lie algebra or, more generally, a 
Kac-Moody algebra, in the category of Hopf algebras. The resulting algebra has additional structure, making it into a 
quasitriangular Hopf algebra. 

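Let $A = (a_{ij})$ be the Cartan matrix of the Kac-Moody algebra, and let q be a nonzero complex number distinct 
from 1. The quantum group $U_q(G)$, where G is the Lie algebra whose Cartan matrix is A, is then defined as the 
unital associative algebra with generators $k_\lambda$ (where $\lambda$ is an element of the weight lattice, i.e. 
$2(\lambda, \alpha_i)/(\alpha_i, \alpha_i) \in \mathbb{Z}$ for all i), and $e_i$ and $f_i$ (for simple roots $\alpha_i$), subject to the following relations: 

• $k_0 = 1$, 

• $k_\lambda k_\mu = k_{\lambda+\mu}$, 

• $k_\lambda e_i k_\lambda^{-1} = q^{(\lambda,\alpha_i)} e_i$, 

• $k_\lambda f_i k_\lambda^{-1} = q^{-(\lambda,\alpha_i)} f_i$, 

• $[e_i, f_j] = \delta_{ij}\,\dfrac{k_i - k_i^{-1}}{q_i - q_i^{-1}}$, 

• $\sum_{n=0}^{1-a_{ij}} (-1)^n \dfrac{[1-a_{ij}]_{q_i}!}{[1-a_{ij}-n]_{q_i}!\,[n]_{q_i}!}\, e_i^{\,n}\, e_j\, e_i^{\,1-a_{ij}-n} = 0$, for $i \neq j$, 

• $\sum_{n=0}^{1-a_{ij}} (-1)^n \dfrac{[1-a_{ij}]_{q_i}!}{[1-a_{ij}-n]_{q_i}!\,[n]_{q_i}!}\, f_i^{\,n}\, f_j\, f_i^{\,1-a_{ij}-n} = 0$, for $i \neq j$, 

where $k_i = k_{\alpha_i}$, $q_i = q^{\frac{1}{2}(\alpha_i,\alpha_i)}$, $[0]_{q_i}! = 1$, $[n]_{q_i}! = \prod_{m=1}^{n} [m]_{q_i}$ for all positive integers n, and 

$$[m]_{q_i} = \frac{q_i^{m} - q_i^{-m}}{q_i - q_i^{-1}}.$$ 

These are the q-factorial and q-number, respectively, the q-analogs of the ordinary factorial and of the ordinary integers. 

The last two relations above are the q-Serre relations, the deformations of the Serre relations. 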
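
The q-number and q-factorial entering the q-Serre relations are easy to tabulate. The sketch below is only an illustration using exact rational arithmetic; the function names `q_number`, `q_factorial` and `q_binomial` are hypothetical helpers, and q is assumed to be a rational number other than 0, 1 and -1.

```python
from fractions import Fraction

def q_number(m, q):
    """[m]_q = (q^m - q^-m) / (q - q^-1), for q not in {0, 1, -1}."""
    q = Fraction(q)
    return (q**m - q**(-m)) / (q - q**(-1))

def q_factorial(n, q):
    """[n]_q! = [1]_q [2]_q ... [n]_q, with [0]_q! = 1."""
    result = Fraction(1)
    for m in range(1, n + 1):
        result *= q_number(m, q)
    return result

def q_binomial(n, k, q):
    """The coefficient [n]_q! / ([n-k]_q! [k]_q!) of the q-Serre relations."""
    return q_factorial(n, q) / (q_factorial(n - k, q) * q_factorial(k, q))

# [2]_q = q + 1/q and [3]_q = q^2 + 1 + q^-2, evaluated at q = 2.
two_q = q_number(2, 2)
three_q = q_number(3, 2)

# Close to q = 1 the q-number approaches the ordinary integer.
near_classical = float(q_number(3, Fraction(1001, 1000)))
```

For $a_{ij} = -1$ the q-Serre coefficients are `q_binomial(2, n, q)` for n = 0, 1, 2, that is 1, $[2]_{q_i}$, 1 up to the alternating sign.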

In the limit as $q \to 1$, these relations approach the relations for the universal enveloping algebra $U(G)$, where 
$k_\lambda \to 1$ and $\dfrac{k_\lambda - k_{-\lambda}}{q - q^{-1}} \to t_\lambda$ as $q \to 1$, where the element $t_\lambda$ of the Cartan subalgebra satisfies 
$(t_\lambda, h) = \lambda(h)$ for all h in the Cartan subalgebra. 

There are various coassociative coproducts under which these algebras are Hopf algebras, for example, 

• $\Delta_1(k_\lambda) = k_\lambda \otimes k_\lambda$, $\Delta_1(e_i) = 1 \otimes e_i + e_i \otimes k_i$, $\Delta_1(f_i) = k_i^{-1} \otimes f_i + f_i \otimes 1$, 

• $\Delta_2(k_\lambda) = k_\lambda \otimes k_\lambda$, $\Delta_2(e_i) = k_i^{-1} \otimes e_i + e_i \otimes 1$, $\Delta_2(f_i) = 1 \otimes f_i + f_i \otimes k_i$, 

• $\Delta_3(k_\lambda) = k_\lambda \otimes k_\lambda$, $\Delta_3(e_i) = k_i^{-1/2} \otimes e_i + e_i \otimes k_i^{1/2}$, $\Delta_3(f_i) = k_i^{-1/2} \otimes f_i + f_i \otimes k_i^{1/2}$, where the 
set of generators has been extended, if required, to include $k_\lambda$ for $\lambda$ which is expressible as the sum of an 
element of the weight lattice and half an element of the root lattice. 

In addition, any Hopf algebra leads to another with reversed coproduct $T \circ \Delta$, where T is given by 
$T(x \otimes y) = y \otimes x$, giving three more possible versions. 
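
That a coproduct of this kind respects the defining relations can be tested numerically in a small case. The following sketch assumes the rank-one algebra $U_q(sl_2)$ with the normalization $(\alpha, \alpha) = 2$ (so that $q_i = q$ and $k e k^{-1} = q^2 e$), acting on its two-dimensional representation; the matrices `E`, `F`, `K` and the helper functions are illustrative choices, not taken from the text. It checks that the first coproduct listed above preserves the commutator relation between e and f on the tensor square.

```python
from fractions import Fraction

q = Fraction(3)  # any rational value other than 0, 1, -1 serves for this check

def matmul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def kron(A, B):
    """Kronecker product, representing the tensor product A (x) B."""
    return [[a * b for a in ra for b in rb] for ra in A for rb in B]

def add(A, B):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]

def sub(A, B):
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]

def scale(c, A):
    return [[c * x for x in row] for row in A]

def commutator(A, B):
    return sub(matmul(A, B), matmul(B, A))

# Two-dimensional representation of U_q(sl_2) (assumed normalization).
I2 = [[1, 0], [0, 1]]
E = [[0, 1], [0, 0]]
F = [[0, 0], [1, 0]]
K = [[q, 0], [0, 1 / q]]
Kinv = [[1 / q, 0], [0, q]]

# First coproduct from the list, applied to the generators.
dE = add(kron(I2, E), kron(E, K))        # 1 (x) e + e (x) k
dF = add(kron(Kinv, F), kron(F, I2))     # k^-1 (x) f + f (x) 1
dK = kron(K, K)                          # k (x) k
dKinv = kron(Kinv, Kinv)

# Relation [e, f] = (k - k^-1)/(q - q^-1), before and after the coproduct.
single_ok = commutator(E, F) == scale(1 / (q - 1 / q), sub(K, Kinv))
tensor_ok = commutator(dE, dF) == scale(1 / (q - 1 / q), sub(dK, dKinv))

# Relation k e k^-1 = q^2 e on the tensor square.
weight_ok = matmul(matmul(dK, dE), dKinv) == scale(q * q, dE)
```

A check on one representation is of course weaker than the algebraic statement, but it catches sign and ordering mistakes in the formulas quickly.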

The counit on $U_q(A)$ is the same for all these coproducts: $\varepsilon(k_\lambda) = 1$, $\varepsilon(e_i) = 0$, $\varepsilon(f_i) = 0$, and the 
respective antipodes for the above coproducts are given by 

• $S_1(k_\lambda) = k_{-\lambda}$, $S_1(e_i) = -e_i k_i^{-1}$, $S_1(f_i) = -k_i f_i$, 

• $S_2(k_\lambda) = k_{-\lambda}$, $S_2(e_i) = -k_i e_i$, $S_2(f_i) = -f_i k_i^{-1}$, 

• $S_3(k_\lambda) = k_{-\lambda}$, $S_3(e_i) = -q_i e_i$, $S_3(f_i) = -q_i^{-1} f_i$. 

Alternatively, the quantum group $U_q(G)$ can be regarded as an algebra over the field $\mathbb{C}(q)$, the field of all 
rational functions of an indeterminate q over $\mathbb{C}$. 

Similarly, the quantum group $U_q(G)$ can be regarded as an algebra over the field $\mathbb{Q}(q)$, the field of all rational 
functions of an indeterminate q over $\mathbb{Q}$ (see below in the section on quantum groups at q = 0). 

Representation theory 

Just as there are many different types of representations for Kac-Moody algebras and their universal enveloping 
algebras, so there are many different types of representation for quantum groups. 

As is the case for all Hopf algebras, $U_q(G)$ has an adjoint representation on itself as a module, with the action being 
given by $\mathrm{Ad}_x \cdot y = \sum_{(x)} x_{(1)}\, y\, S(x_{(2)})$, where $\Delta(x) = \sum_{(x)} x_{(1)} \otimes x_{(2)}$. 






Case 1: q is not a root of unity 

One important type of representation is a weight representation, and the corresponding module is called a weight 
module. A weight module is a module with a basis of weight vectors. A weight vector is a nonzero vector v such that 
$k_\lambda . v = d_\lambda v$ for all $\lambda$, where $d_\lambda$ are complex numbers for all weights $\lambda$ such that 

• $d_0 = 1$, 

• $d_\lambda d_\mu = d_{\lambda+\mu}$, for all weights $\lambda$ and $\mu$. 

A weight module is called integrable if the actions of $e_i$ and $f_i$ are locally nilpotent (i.e. for any vector v in the 
module, there exists a positive integer k, possibly dependent on v, such that $e_i^k . v = f_i^k . v = 0$ for all i). In the case 
of integrable modules, the complex numbers $d_\lambda$ associated with a weight vector satisfy $d_\lambda = c_\lambda q^{(\lambda,\nu)}$, where $\nu$ 
is an element of the weight lattice, and $c_\lambda$ are complex numbers such that 

• $c_0 = 1$, 

• $c_\lambda c_\mu = c_{\lambda+\mu}$, for all weights $\lambda$ and $\mu$, 

• $c_{2\alpha_i} = 1$ for all i. 

Of special interest are highest weight representations, and the corresponding highest weight modules. A highest 
weight module is a module generated by a weight vector v, subject to $k_\lambda . v = d_\lambda v$ for all weights $\lambda$, and 
$e_i . v = 0$ for all i. Similarly, a quantum group can have a lowest weight representation and lowest weight module, 
i.e. a module generated by a weight vector v, subject to $k_\lambda . v = d_\lambda v$ for all weights $\lambda$, and $f_i . v = 0$ for all i. 

Define a vector v to have weight $\nu$ if $k_\lambda . v = q^{(\lambda,\nu)} v$ for all $\lambda$ in the weight lattice. 

If G is a Kac-Moody algebra, then in any irreducible highest weight representation of $U_q(G)$, with highest weight 
$\nu$, the multiplicities of the weights are equal to their multiplicities in an irreducible representation of $U(G)$ with 
equal highest weight. If the highest weight is dominant and integral (a weight $\mu$ is dominant and integral if $\mu$ 
satisfies the condition that $2(\mu, \alpha_i)/(\alpha_i, \alpha_i)$ is a non-negative integer for all i), then the weight spectrum of the 
irreducible representation is invariant under the Weyl group for G, and the representation is integrable. 

Conversely, if a highest weight module is integrable, then its highest weight vector v satisfies $k_\lambda . v = c_\lambda q^{(\lambda,\nu)} v$, 
where $c_\lambda$ are complex numbers such that 

• $c_0 = 1$, 

• $c_\lambda c_\mu = c_{\lambda+\mu}$, for all weights $\lambda$ and $\mu$, 

• $c_{2\alpha_i} = 1$ for all i, 

and $\nu$ is dominant and integral. 

As is the case for all Hopf algebras, the tensor product of two modules is another module. For an element x of 
$U_q(G)$, and for vectors v and w in the respective modules, $x.(v \otimes w) = \Delta(x).(v \otimes w)$, so that 
$k_\lambda.(v \otimes w) = k_\lambda.v \otimes k_\lambda.w$, and in the case of coproduct $\Delta_1$, $e_i.(v \otimes w) = v \otimes e_i.w + e_i.v \otimes k_i.w$ 
and $f_i.(v \otimes w) = k_i^{-1}.v \otimes f_i.w + f_i.v \otimes w$. 

The integrable highest weight module described above is a tensor product of a one-dimensional module (on which 
$k_\lambda = c_\lambda$ for all $\lambda$, and $e_i = f_i = 0$ for all i) and a highest weight module generated by a nonzero vector $v_0$, 
subject to $k_\lambda . v_0 = q^{(\lambda,\nu)} v_0$ for all weights $\lambda$, and $e_i . v_0 = 0$ for all i. 

In the specific case where G is a finite-dimensional Lie algebra (as a special case of a Kac-Moody algebra), then the 
irreducible representations with dominant integral highest weights are also finite-dimensional. 

In the case of a tensor product of highest weight modules, its decomposition into submodules is the same as for the 
tensor product of the corresponding modules of the Kac-Moody algebra (the highest weights are the same, as are 
their multiplicities). 






Quasitriangularity 

Case 1: q is not a root of unity 

Strictly, the quantum group $U_q(G)$ is not quasitriangular, but it can be thought of as being "nearly quasitriangular" 
in that there exists an infinite formal sum which plays the role of an R-matrix. This infinite formal sum is expressible 
in terms of generators $e_i$ and $f_i$, and Cartan generators $t_\lambda$, where $k_\lambda$ is formally identified with $q^{t_\lambda}$. The 
infinite formal sum is the product of two factors, $q^{\eta \sum_j t_{\lambda_j} \otimes t_{\mu_j}}$ and an infinite formal sum, where $\{\lambda_j\}$ is a basis 
for the dual space to the Cartan subalgebra, $\{\mu_j\}$ is the dual basis, and $\eta$ is a sign (+1 or -1). 

The formal infinite sum which plays the part of the R-matrix has a well-defined action on the tensor product of two 
irreducible highest weight modules, and also on the tensor product of two lowest weight modules. Specifically, if v 
has weight $\alpha$ and w has weight $\beta$, then $q^{\eta \sum_j t_{\lambda_j} \otimes t_{\mu_j}} . (v \otimes w) = q^{\eta(\alpha,\beta)}\, v \otimes w$, and the fact that the 
modules are both highest weight modules or both lowest weight modules reduces the action of the other factor on 
$v \otimes w$ to a finite sum. 

Specifically, if V is a highest weight module, then the formal infinite sum, R, has a well-defined, and invertible, 
action on $V \otimes V$, and this value of R (as an element of $\mathrm{Hom}(V) \otimes \mathrm{Hom}(V)$) satisfies the Yang-Baxter 
equation, and therefore allows us to determine a representation of the braid group, and to define quasi-invariants for 
knots, links and braids. 

Quantum groups at q = 0 

Masaki Kashiwara has researched the limiting behaviour of quantum groups as $q \to 0$. 

As a consequence of the defining relations for the quantum group $U_q(G)$, $U_q(G)$ can be regarded as a Hopf 
algebra over $\mathbb{Q}(q)$, the field of all rational functions of an indeterminate q over $\mathbb{Q}$. 

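For simple root $\alpha_i$ and non-negative integer n, define $e_i^{(n)} = e_i^n / [n]_{q_i}!$ and $f_i^{(n)} = f_i^n / [n]_{q_i}!$ (specifically, 
$e_i^{(0)} = f_i^{(0)} = 1$). In an integrable module M, and for weight $\lambda$, a vector $u \in M_\lambda$ (i.e. a vector u in M 
with weight $\lambda$) can be uniquely decomposed into the sums 

$$u = \sum_{n=0}^{\infty} f_i^{(n)} u_n = \sum_{n=0}^{\infty} e_i^{(n)} v_n,$$ 

where $u_n \in \ker(e_i) \cap M_{\lambda + n\alpha_i}$, $v_n \in \ker(f_i) \cap M_{\lambda - n\alpha_i}$, $u_n \neq 0$ only if $n + \frac{2(\lambda,\alpha_i)}{(\alpha_i,\alpha_i)} \geq 0$, and 
$v_n \neq 0$ only if $n - \frac{2(\lambda,\alpha_i)}{(\alpha_i,\alpha_i)} \geq 0$. Linear mappings $\tilde{e}_i : M \to M$ and $\tilde{f}_i : M \to M$ can be defined on 
$M_\lambda$ by 

$$\tilde{e}_i u = \sum_{n=1}^{\infty} f_i^{(n-1)} u_n = \sum_{n=0}^{\infty} e_i^{(n+1)} v_n,$$ 

$$\tilde{f}_i u = \sum_{n=0}^{\infty} f_i^{(n+1)} u_n = \sum_{n=1}^{\infty} e_i^{(n-1)} v_n.$$ 

Let A be the integral domain of all rational functions in $\mathbb{Q}(q)$ which are regular at q = 0 (i.e. a rational function 
f(q) is an element of A if and only if there exist polynomials g(q) and h(q) in the polynomial ring $\mathbb{Q}[q]$ such 
that $h(0) \neq 0$ and $f(q) = g(q)/h(q)$). A crystal base for M is an ordered pair (L, B), such that 

• L is a free A-submodule of M such that $M = \mathbb{Q}(q) \otimes_A L$, 

• B is a $\mathbb{Q}$-basis of the vector space L/qL over $\mathbb{Q}$, 

• $L = \bigoplus_\lambda L_\lambda$ and $B = \bigsqcup_\lambda B_\lambda$, where $L_\lambda = L \cap M_\lambda$ and $B_\lambda = B \cap (L_\lambda / qL_\lambda)$, 

• $\tilde{e}_i L \subset L$ and $\tilde{f}_i L \subset L$ for all i, 

• $\tilde{e}_i B \subset B \cup \{0\}$ and $\tilde{f}_i B \subset B \cup \{0\}$ for all i, 

• for all $b \in B$ and $b' \in B$, and for all i, $\tilde{e}_i b = b'$ if and only if $\tilde{f}_i b' = b$. 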

To put this into a more informal setting, the actions of $e_i f_i$ and $f_i e_i$ are generally singular at q = 0 on an 
integrable module M. The linear mappings $\tilde{e}_i$ and $\tilde{f}_i$ on the module are introduced so that the actions of 
$\tilde{e}_i \tilde{f}_i$ and $\tilde{f}_i \tilde{e}_i$ are regular at q = 0 on the module. There exists a $\mathbb{Q}(q)$-basis of weight vectors for M, with 
respect to which the actions of $\tilde{e}_i$ and $\tilde{f}_i$ are regular at q = 0 for all i. The module is then restricted to the free 
A-module generated by the basis, and the basis vectors, the A-submodule and the actions of $\tilde{e}_i$ and $\tilde{f}_i$ are evaluated 
at q = 0. Furthermore, the basis can be chosen such that at q = 0, for all i, $\tilde{e}_i$ and $\tilde{f}_i$ are represented by 
mutual transposes, and map basis vectors to basis vectors or 0. 

A crystal base can be represented by a directed graph with labelled edges. Each vertex of the graph represents an 
element of the $\mathbb{Q}$-basis B of L/qL, and a directed edge, labelled by i, and directed from vertex $v_1$ to vertex 
$v_2$, represents that $b_2 = \tilde{f}_i b_1$ (and, equivalently, that $b_1 = \tilde{e}_i b_2$), where $b_1$ is the basis element represented by 
$v_1$, and $b_2$ is the basis element represented by $v_2$. The graph completely determines the actions of $\tilde{e}_i$ and $\tilde{f}_i$ at 
q = 0. If an integrable module has a crystal base, then the module is irreducible if and only if the graph 
representing the crystal base is connected (a graph is called "connected" if the set of vertices cannot be partitioned 
into the union of nontrivial disjoint subsets $V_1$ and $V_2$ such that there are no edges joining any vertex in $V_1$ to any 
vertex in $V_2$). 

For any integrable module with a crystal base, the weight spectrum for the crystal base is the same as the weight 
spectrum for the module, and therefore the weight spectrum for the crystal base is the same as the weight spectrum 
for the corresponding module of the appropriate Kac-Moody algebra. The multiplicities of the weights in the crystal 
base are also the same as their multiplicities in the corresponding module of the appropriate Kac-Moody algebra. 

It is a theorem of Kashiwara that every integrable highest weight module has a crystal base. Similarly, every 
integrable lowest weight module has a crystal base. 

Tensor products of crystal bases 

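Let M be an integrable module with crystal base (L, B) and M' be an integrable module with crystal base 
(L', B'). For crystal bases, the coproduct $\Delta$, given by 
$\Delta(k_\lambda) = k_\lambda \otimes k_\lambda$, $\Delta(e_i) = e_i \otimes k_i^{-1} + 1 \otimes e_i$, $\Delta(f_i) = f_i \otimes 1 + k_i \otimes f_i$, is adopted. The integrable 
module $M \otimes_{\mathbb{Q}(q)} M'$ has crystal base $(L \otimes_A L', B \otimes B')$, where 
$B \otimes B' = \{ b \otimes_{\mathbb{Q}} b' : b \in B, b' \in B' \}$. For a basis vector $b \in B$, define 
$\varepsilon_i(b) = \max\{ n \geq 0 : \tilde{e}_i^{\,n} b \neq 0 \}$ and $\varphi_i(b) = \max\{ n \geq 0 : \tilde{f}_i^{\,n} b \neq 0 \}$. The actions of $\tilde{e}_i$ and $\tilde{f}_i$ on 
$b \otimes b'$ are given by 

$$\tilde{e}_i (b \otimes b') = \begin{cases} \tilde{e}_i b \otimes b' & \text{if } \varphi_i(b) \geq \varepsilon_i(b'), \\ b \otimes \tilde{e}_i b' & \text{if } \varphi_i(b) < \varepsilon_i(b'), \end{cases}$$ 

$$\tilde{f}_i (b \otimes b') = \begin{cases} \tilde{f}_i b \otimes b' & \text{if } \varphi_i(b) > \varepsilon_i(b'), \\ b \otimes \tilde{f}_i b' & \text{if } \varphi_i(b) \leq \varepsilon_i(b'). \end{cases}$$ 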



The decomposition of the product of two integrable highest weight modules into irreducible submodules is determined 
by the decomposition of the graph of the crystal base into its connected components (i.e. the highest weights of the 
submodules are determined, and the multiplicity of each highest weight is determined). 
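The tensor product rule and the role of connected components can be sketched for the simplest case, $U_q(\mathfrak{sl}_2)$. The following is a minimal illustration, not a general implementation: the class `Sl2Crystal` and the helper names are hypothetical, the crystal $B(n)$ is modelled as the chain $0 \to 1 \to \dots \to n$ with $\varepsilon(k) = k$ and $\varphi(k) = n - k$, and the standard rule for $\tilde{f}$ on $b \otimes b'$ (act on the left factor when $\varphi(b) > \varepsilon(b')$, otherwise on the right) is assumed.

```python
from itertools import product


class Sl2Crystal:
    """Crystal B(n) of the (n+1)-dimensional U_q(sl_2)-module (illustrative model).

    Elements are the integers 0..n; 0 is the highest weight element and the
    lowering operator f sends k to k+1 (None stands for the zero vector).
    """

    def __init__(self, n):
        self.n = n
        self.elements = list(range(n + 1))

    def f(self, k):
        return k + 1 if k < self.n else None

    def eps(self, k):   # eps(b) = max{m : e^m b != 0}
        return k

    def phi(self, k):   # phi(b) = max{m : f^m b != 0}
        return self.n - k


def tensor_f(B1, B2, pair):
    """Apply f to b (x) b' using the tensor product rule."""
    b, bp = pair
    if B1.phi(b) > B2.eps(bp):
        return (B1.f(b), bp)          # f acts on the left factor
    nxt = B2.f(bp)                    # otherwise f acts on the right factor
    return None if nxt is None else (b, nxt)


def component_sizes(B1, B2):
    """Sizes of the connected components of the crystal graph of B1 (x) B2."""
    nodes = list(product(B1.elements, B2.elements))
    parent = {v: v for v in nodes}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    # Each f-arrow joins its source and target into one component.
    for v in nodes:
        w = tensor_f(B1, B2, v)
        if w is not None:
            parent[find(v)] = find(w)

    sizes = {}
    for v in nodes:
        root = find(v)
        sizes[root] = sizes.get(root, 0) + 1
    return sorted(sizes.values(), reverse=True)
```

For example, `component_sizes(Sl2Crystal(1), Sl2Crystal(1))` gives components of sizes 3 and 1, matching the decomposition of the product of two 2-dimensional modules into a 3-dimensional and a 1-dimensional module.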






Compact matrix quantum groups 

See also compact quantum group. 

S.L. Woronowicz introduced compact matrix quantum groups. Compact matrix quantum groups are abstract 
structures on which the "continuous functions" on the structure are given by elements of a C*-algebra. The geometry
of a compact matrix quantum group is a special case of a noncommutative geometry. 

The continuous complex-valued functions on a compact Hausdorff topological space form a commutative 
C*-algebra. By the Gelfand theorem, a commutative C*-algebra is isomorphic to the C*-algebra of continuous 
complex-valued functions on a compact Hausdorff topological space, and the topological space is uniquely 
determined by the C*-algebra up to homeomorphism. 

For a compact topological group, $G$, there exists a C*-algebra homomorphism $\Delta : C(G) \to C(G) \otimes C(G)$
(where $C(G) \otimes C(G)$ is the C*-algebra tensor product - the completion of the algebraic tensor product of $C(G)$
and $C(G)$), such that $\Delta(f)(x, y) = f(xy)$ for all $f \in C(G)$, and for all $x, y \in G$ (where
$(f \otimes g)(x, y) = f(x) g(y)$ for all $f, g \in C(G)$ and all $x, y \in G$). There also exists a linear multiplicative
mapping $\kappa : C(G) \to C(G)$, such that $\kappa(f)(x) = f(x^{-1})$ for all $f \in C(G)$ and all $x \in G$. Strictly, this
does not make $C(G)$ a Hopf algebra, unless $G$ is finite. On the other hand, a finite-dimensional representation of $G$
can be used to generate a *-subalgebra of $C(G)$ which is also a Hopf *-algebra. Specifically, if $g \mapsto (u_{ij}(g))_{i,j}$
is an $n$-dimensional representation of $G$, then $u_{ij} \in C(G)$ for all $i, j$, and $\Delta(u_{ij}) = \sum_k u_{ik} \otimes u_{kj}$ for all
$i, j$. It follows that the *-algebra generated by $u_{ij}$ for all $i, j$ and $\kappa(u_{ij})$ for all $i, j$ is a Hopf *-algebra: the
counit is determined by $\epsilon(u_{ij}) = \delta_{ij}$ for all $i, j$ (where $\delta_{ij}$ is the Kronecker delta), the antipode is $\kappa$, and the
unit is given by $1 = \sum_k u_{1k} \kappa(u_{k1}) = \sum_k \kappa(u_{1k}) u_{k1}$.

As a generalization, a compact matrix quantum group is defined as a pair $(C, u)$, where $C$ is a C*-algebra and
$u = (u_{ij})_{i,j = 1, \dots, n}$ is a matrix with entries in $C$ such that

• The *-subalgebra, $C_0$, of $C$, which is generated by the matrix elements of $u$, is dense in $C$;

• There exists a C*-algebra homomorphism $\Delta : C \to C \otimes C$ (where $C \otimes C$ is the C*-algebra tensor
product - the completion of the algebraic tensor product of $C$ and $C$) such that
$\Delta(u_{ij}) = \sum_k u_{ik} \otimes u_{kj}$ for all $i, j$ ($\Delta$ is called the comultiplication);

• There exists a linear antimultiplicative map $\kappa : C_0 \to C_0$ (the coinverse) such that $\kappa(\kappa(v^*)^*) = v$ for all
$v \in C_0$ and $\sum_k \kappa(u_{ik}) u_{kj} = \sum_k u_{ik} \kappa(u_{kj}) = \delta_{ij} I$, where $I$ is the identity element of $C$. Since $\kappa$ is
antimultiplicative, $\kappa(vw) = \kappa(w) \kappa(v)$ for all $v, w \in C_0$.

As a consequence of continuity, the comultiplication on $C$ is coassociative.

In general, $C$ is not a bialgebra, and $C_0$ is a Hopf *-algebra.
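The remark above that the functions on a finite group do form a Hopf algebra can be checked directly on a small example. The sketch below is illustrative only: it takes the symmetric group $S_3$ as the finite group, stores a function on the group as a dictionary, and uses hypothetical helper names. For a finite group the tensor product of the function algebra with itself is identified with functions on pairs of group elements, which is what `coproduct` returns.

```python
from itertools import permutations

# The group S_3: permutations of (0, 1, 2) stored as tuples.
G = list(permutations(range(3)))
E = (0, 1, 2)                      # identity element


def mul(x, y):
    """Composition (x*y)(i) = x(y(i))."""
    return tuple(x[y[i]] for i in range(3))


def inv(x):
    out = [0] * 3
    for i, xi in enumerate(x):
        out[xi] = i
    return tuple(out)


# A function f on the group is a dict {group element: complex value}.
def coproduct(f):
    """Delta(f)(x, y) = f(xy), a function on pairs of group elements."""
    return {(x, y): f[mul(x, y)] for x in G for y in G}


def antipode(f):
    """kappa(f)(x) = f(x^{-1})."""
    return {x: f[inv(x)] for x in G}


def counit(f):
    """epsilon(f) = f(e)."""
    return f[E]


def antipode_axiom_holds(f):
    """Check m(kappa (x) id)Delta(f) = epsilon(f) 1 = m(id (x) kappa)Delta(f)."""
    d = coproduct(f)
    left = {x: d[(inv(x), x)] for x in G}    # kappa applied to the first leg
    right = {x: d[(x, inv(x))] for x in G}   # kappa applied to the second leg
    return all(left[x] == counit(f) == right[x] for x in G)


def coassociative(f):
    """Check (Delta (x) id)Delta(f) = (id (x) Delta)Delta(f) pointwise."""
    return all(f[mul(mul(x, y), z)] == f[mul(x, mul(y, z))]
               for x in G for y in G for z in G)


# A sample function with distinct values on the six group elements.
f = {g: complex(k + 1, -k) for k, g in enumerate(G)}
```

Both checks reduce to group axioms: coassociativity of the comultiplication is associativity of the group law, and the antipode axiom is the existence of inverses.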

Informally, $C$ can be regarded as the *-algebra of continuous complex-valued functions over the compact matrix
quantum group, and $u$ can be regarded as a finite-dimensional representation of the compact matrix quantum group.
A representation of the compact matrix quantum group is given by a corepresentation of the Hopf *-algebra (a
corepresentation of a counital coassociative coalgebra $A$ is a square matrix $v = (v_{ij})_{i,j = 1, \dots, n}$ with entries in $A$
(so $v \in M_n(A)$) such that $\Delta(v_{ij}) = \sum_{k=1}^{n} v_{ik} \otimes v_{kj}$ for all $i, j$ and $\epsilon(v_{ij}) = \delta_{ij}$ for all $i, j$). Furthermore,
a representation $v$ is called unitary if the matrix for $v$ is unitary (or equivalently, if $\kappa(v_{ij}) = v_{ji}^*$ for all $i, j$).
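An example of a compact matrix quantum group is $SU_\mu(2)$, where the parameter $\mu$ is a positive real number. So
$SU_\mu(2) = (C(SU_\mu(2)), u)$, where $C(SU_\mu(2))$ is the C*-algebra generated by $\alpha$ and $\gamma$, subject to

$$\gamma \gamma^* = \gamma^* \gamma, \quad \alpha \gamma = \mu \gamma \alpha, \quad \alpha \gamma^* = \mu \gamma^* \alpha, \quad \alpha \alpha^* + \mu \gamma^* \gamma = \alpha^* \alpha + \mu^{-1} \gamma^* \gamma = I,$$

and $u = \begin{pmatrix} \alpha & \gamma \\ -\gamma^* & \alpha^* \end{pmatrix}$, so that the comultiplication is determined by $\Delta(\alpha) = \alpha \otimes \alpha - \gamma \otimes \gamma^*$,
$\Delta(\gamma) = \alpha \otimes \gamma + \gamma \otimes \alpha^*$, and the coinverse is determined by $\kappa(\alpha) = \alpha^*$, $\kappa(\gamma) = -\mu^{-1} \gamma$,
$\kappa(\gamma^*) = -\mu \gamma^*$, $\kappa(\alpha^*) = \alpha$. Note that $u$ is a representation, but not a unitary representation. $u$ is equivalent
to the unitary representation $v = \begin{pmatrix} \alpha & \sqrt{\mu}\, \gamma \\ -\frac{1}{\sqrt{\mu}} \gamma^* & \alpha^* \end{pmatrix}$.

Equivalently, $SU_\mu(2) = (C(SU_\mu(2)), w)$, where $C(SU_\mu(2))$ is the C*-algebra generated by $\alpha$ and $\beta$, subject to

$$\beta \beta^* = \beta^* \beta, \quad \alpha \beta = \mu \beta \alpha, \quad \alpha \beta^* = \mu \beta^* \alpha, \quad \alpha \alpha^* + \mu^2 \beta^* \beta = \alpha^* \alpha + \beta^* \beta = I,$$

and $w = \begin{pmatrix} \alpha & \mu \beta \\ -\beta^* & \alpha^* \end{pmatrix}$, so that the comultiplication is determined by $\Delta(\alpha) = \alpha \otimes \alpha - \mu \beta \otimes \beta^*$,
$\Delta(\beta) = \alpha \otimes \beta + \beta \otimes \alpha^*$, and the coinverse is determined by $\kappa(\alpha) = \alpha^*$, $\kappa(\beta) = -\mu^{-1} \beta$,
$\kappa(\beta^*) = -\mu \beta^*$, $\kappa(\alpha^*) = \alpha$. Note that $w$ is a unitary representation. The realizations can be identified by
equating $\gamma = \sqrt{\mu}\, \beta$.

When $\mu = 1$, then $SU_\mu(2)$ is equal to the algebra $C(SU(2))$ of functions on the concrete compact group
$SU(2)$.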
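The classical limit can be checked numerically. The sketch below assumes that the fundamental matrix is $u = \begin{pmatrix} \alpha & \gamma \\ -\gamma^* & \alpha^* \end{pmatrix}$, so that at $\mu = 1$ the generators $\alpha$ and $\gamma$ become the coordinate functions picking out the first row of a matrix in $SU(2)$. Under that assumption it tests the defining relation and the formulas for the comultiplication and coinverse on random group elements; all function names are illustrative.

```python
import cmath
import math
import random


def random_su2(rng):
    """A random matrix [[a, c], [-conj(c), conj(a)]] with |a|^2 + |c|^2 = 1."""
    theta = rng.uniform(0, math.pi / 2)
    a = math.cos(theta) * cmath.exp(1j * rng.uniform(0, 2 * math.pi))
    c = math.sin(theta) * cmath.exp(1j * rng.uniform(0, 2 * math.pi))
    return ((a, c), (-c.conjugate(), a.conjugate()))


def matmul(x, y):
    return tuple(tuple(sum(x[i][k] * y[k][j] for k in range(2))
                       for j in range(2)) for i in range(2))


# Coordinate functions on SU(2): alpha is entry (1,1), gamma is entry (1,2).
def alpha(g):
    return g[0][0]


def gamma(g):
    return g[0][1]


def close(z, w, tol=1e-12):
    return abs(z - w) < tol


def check_mu_equals_one(trials=100, seed=0):
    rng = random.Random(seed)
    for _ in range(trials):
        x, y = random_su2(rng), random_su2(rng)
        a, c = alpha(x), gamma(x)
        # Relation alpha alpha* + gamma* gamma = I in the commutative case.
        if not close(a * a.conjugate() + c.conjugate() * c, 1):
            return False
        xy = matmul(x, y)
        # Delta(alpha) = alpha (x) alpha - gamma (x) gamma*
        if not close(alpha(xy),
                     alpha(x) * alpha(y) - gamma(x) * gamma(y).conjugate()):
            return False
        # Delta(gamma) = alpha (x) gamma + gamma (x) alpha*
        if not close(gamma(xy),
                     alpha(x) * gamma(y) + gamma(x) * alpha(y).conjugate()):
            return False
        # The inverse of a unitary matrix is its conjugate transpose, so
        # kappa(alpha) = alpha* and kappa(gamma) = -gamma at mu = 1.
        x_inv = ((x[0][0].conjugate(), x[1][0].conjugate()),
                 (x[0][1].conjugate(), x[1][1].conjugate()))
        if not (close(alpha(x_inv), alpha(x).conjugate())
                and close(gamma(x_inv), -gamma(x))):
            return False
    return True
```

Here evaluating $\Delta(f)$ at a pair $(x, y)$ means evaluating $f$ at the product $xy$, as in the description of functions on a compact group earlier in this section.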

Bicrossproduct quantum groups 

Whereas compact matrix pseudogroups are typically versions of Drinfeld-Jimbo quantum groups in a dual function 
algebra formulation, with additional structure, the bicrossproduct ones are a distinct second family of quantum 
groups of increasing importance as deformations of solvable rather than semisimple Lie groups. They are associated 
to Lie splittings of Lie algebras or local factorisations of Lie groups and can be viewed as the cross product or 
Mackey quantisation of one of the factors acting on the other for the algebra and a similar story for the coproduct $\Delta$
with the second factor acting back on the first. The very simplest nontrivial example corresponds to two copies of
$\mathbb{R}$ locally acting on each other and results in a quantum group (given here in an algebraic form) with generators
$p$, $K$, $K^{-1}$, say, and coproduct

$$[p, K] = h K (K - 1), \quad \Delta p = p \otimes K + 1 \otimes p, \quad \Delta K = K \otimes K,$$

where $h$ is the deformation parameter. This quantum group was linked to a toy model of Planck scale physics
implementing Born reciprocity when viewed as a deformation of the Heisenberg algebra of quantum mechanics.
Also, starting with any compact real form of a semisimple Lie algebra $\mathfrak{g}$, its complexification as a real Lie algebra of
twice the dimension splits into $\mathfrak{g}$ and a certain solvable Lie algebra (the Iwasawa decomposition), and this provides
a canonical bicrossproduct quantum group associated to $\mathfrak{g}$. For $\mathfrak{su}(2)$ one obtains a quantum group deformation of
the Euclidean group E(3) of motions in 3 dimensions.
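As a consistency check on the simplest example, one can verify that the coproduct respects the commutation relation, that is, that $\Delta$ extends to an algebra homomorphism. The short computation below assumes the relation $[p, K] = hK(K - 1)$ and the coproducts $\Delta p = p \otimes K + 1 \otimes p$, $\Delta K = K \otimes K$ as read from the formula above.

```latex
\begin{align*}
[\Delta p, \Delta K]
  &= [p \otimes K,\; K \otimes K] + [1 \otimes p,\; K \otimes K] \\
  &= [p, K] \otimes K^2 + K \otimes [p, K] \\
  &= h\,(K^2 - K) \otimes K^2 + h\, K \otimes (K^2 - K) \\
  &= h\,\bigl(K^2 \otimes K^2 - K \otimes K\bigr) \\
  &= h\,(K \otimes K)\bigl(K \otimes K - 1 \otimes 1\bigr)
   = h\,\Delta K\,(\Delta K - 1).
\end{align*}
```

The two mixed terms $\mp h\, K \otimes K^2$ cancel, which is what makes the stated coproduct compatible with the deformed commutator.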



Notes 

[1] Schwiebert, Christian (1994), Generalized quantum inverse scattering, arXiv:hep-th/9412237v3 
[2] Majid, Shahn (1988), "Hopf algebras for physics at the Planck scale", Classical and Quantum Gravity 5: 1587-1607,
doi:10.1088/0264-9381/5/12/010



References 

• Podleś, P.; Müller, E., Introduction to quantum groups, arXiv:q-alg/9704002

• Kassel, Christian (1995), Quantum groups, Graduate Texts in Mathematics, 155, Berlin, New York: 
Springer-Verlag, MR1321145, ISBN 978-0-387-94370-1

• Majid, Shahn (2002), A quantum groups primer, London Mathematical Society Lecture Note Series, 292, 
Cambridge University Press, MR1904789, ISBN 978-0-521-01041-2 






• Street, Ross (2007), Quantum groups, Australian Mathematical Society Lecture Series, 19, Cambridge University 
Press, MR2294803, ISBN 978-0-521-69524-4.

• Majid, Shahn (January 2006), "What Is... a Quantum Group?" (http://www.ams.org/notices/200601/what-is. 
pdf) (PDF), Notices of the American Mathematical Society 53 (1): 30-31, retrieved 2008-01-16 

• Shnider, Steven; Sternberg, Shlomo (1993) Quantum groups. From coalgebras to Drinfel'd algebras. A guided 
tour. Graduate Texts in Mathematical Physics, II. International Press, Cambridge, MA. 

Affine quantum group 

In mathematics, a quantum affine algebra (or affine quantum group) is a Hopf algebra that is a q-deformation of
the universal enveloping algebra of an affine Lie algebra. They were introduced independently by Drinfeld (1985) 
and Jimbo (1985) as a special case of their general construction of a quantum group from a Cartan matrix. One of 
their principal applications has been to the theory of solvable lattice models in quantum statistical mechanics, where 
the Yang-Baxter equation occurs with a spectral parameter. Combinatorial aspects of the representation theory of 
quantum affine algebras can be described simply using crystal bases, which correspond to the degenerate case when 
the deformation parameter q vanishes and the Hamiltonian of the associated lattice model can be explicitly 
diagonalized. 

References 

• Drinfeld, V. G. (1985), "Hopf algebras and the quantum Yang-Baxter equation", Doklady Akademii Nauk SSSR 
283 (5): 1060-1064, MR802128, ISSN 0002-3264 

• Drinfeld, V. G. (1987), "A new realization of Yangians and of quantum affine algebras", Doklady Akademii Nauk 
SSSR 296 (1): 13-17, MR914215, ISSN 0002-3264 

• Frenkel, Igor B.; Reshetikhin, N. Yu. (1992), "Quantum affine algebras and holonomic difference equations" [1],
Communications in Mathematical Physics 146 (1): 1-60, doi:10.1007/BF02099206, MR1163666,
ISSN 0010-3616

• Jimbo, Michio (1985), "A q-difference analogue of U(g) and the Yang-Baxter equation", Letters in Mathematical 
Physics 10 (1): 63-69, doi:10.1007/BF00704588, MR797001, ISSN 0377-9017

• Jimbo, Michio; Miwa, Tetsuji (1995), Algebraic analysis of solvable lattice models, CBMS Regional Conference 
Series in Mathematics, 85, Published for the Conference Board of the Mathematical Sciences, Washington, DC, 
MR1308712, ISBN 978-0-8218-0320-2 



References 

[1] http://projecteuclid.org/getRecord?id=euclid.cmp/1104249974






Affine Lie algebra 

In mathematics, an affine Lie algebra is an infinite-dimensional Lie algebra that is constructed in a canonical
fashion out of a finite-dimensional simple Lie algebra. It is a Kac-Moody algebra whose generalized Cartan matrix
is positive semi-definite and has corank 1. From a purely mathematical point of view, affine Lie algebras are
interesting because their representation theory, like the representation theory of finite-dimensional semisimple Lie
algebras, is much better understood than that of general Kac-Moody algebras. As observed by Victor Kac, the
character formula for representations of affine Lie algebras implies certain combinatorial identities, the Macdonald
identities.

Affine Lie algebras play an important role in string theory and conformal field theory due to the way they are
constructed: starting from a simple Lie algebra $\mathfrak{g}$, one considers the loop algebra, $L\mathfrak{g}$, formed by the $\mathfrak{g}$-valued
functions on a circle (interpreted as the closed string) with pointwise commutator. The affine Lie algebra $\hat{\mathfrak{g}}$ is
obtained by adding one extra dimension to the loop algebra and modifying a commutator in a non-trivial way, which
physicists call a quantum anomaly. The point of view of string theory helps to understand many deep properties of
affine Lie algebras, such as the fact that the characters of their representations are given by modular forms.

Affine Lie algebras from simple Lie algebras 
Construction 

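If $\mathfrak{g}$ is a finite-dimensional simple Lie algebra, the corresponding affine Lie algebra $\hat{\mathfrak{g}}$ is constructed as a central
extension of the infinite-dimensional Lie algebra $\mathfrak{g} \otimes \mathbb{C}[t, t^{-1}]$, with one-dimensional center $\mathbb{C}c$. As a vector
space,

$$\hat{\mathfrak{g}} = \mathfrak{g} \otimes \mathbb{C}[t, t^{-1}] \oplus \mathbb{C}c,$$

where $\mathbb{C}[t, t^{-1}]$ is the complex vector space of Laurent polynomials in the indeterminate $t$. The Lie bracket is
defined by the formula

$$[a \otimes t^n + \alpha c,\; b \otimes t^m + \beta c] = [a, b] \otimes t^{n+m} + \langle a | b \rangle\, n\, \delta_{m+n,0}\, c$$

for all $a, b \in \mathfrak{g}$, $\alpha, \beta \in \mathbb{C}$ and $n, m \in \mathbb{Z}$, where $[a, b]$ is the Lie bracket in the Lie algebra $\mathfrak{g}$ and $\langle \cdot | \cdot \rangle$ is the
Cartan-Killing form on $\mathfrak{g}$.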
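The bracket formula can be made concrete for the smallest case, the affine algebra built on $\mathfrak{sl}_2$. The sketch below is an illustration under stated assumptions: elements are stored as a pair (dictionary from the exponent of $t$ to a 2x2 matrix, coefficient of $c$), the Killing form of $\mathfrak{sl}_2$ is taken as $4\,\mathrm{tr}(ab)$, and all helper names are hypothetical.

```python
from itertools import product

ZERO = ((0, 0), (0, 0))
E = ((0, 1), (0, 0))
F = ((0, 0), (1, 0))
H = ((1, 0), (0, -1))


def mat_mul(a, b):
    return tuple(tuple(sum(a[i][k] * b[k][j] for k in range(2))
                       for j in range(2)) for i in range(2))


def mat_add(a, b):
    return tuple(tuple(a[i][j] + b[i][j] for j in range(2)) for i in range(2))


def commutator(a, b):
    ab, ba = mat_mul(a, b), mat_mul(b, a)
    return tuple(tuple(ab[i][j] - ba[i][j] for j in range(2)) for i in range(2))


def killing(a, b):
    """Killing form of sl_2, which equals 4 tr(ab)."""
    ab = mat_mul(a, b)
    return 4 * (ab[0][0] + ab[1][1])


def bracket(x, y):
    """[a t^n + alpha c, b t^m + beta c] = [a,b] t^(n+m) + <a|b> n delta c."""
    (xl, _), (yl, _) = x, y          # c is central, so its coefficients drop out
    loops, central = {}, 0
    for (n, a), (m, b) in product(xl.items(), yl.items()):
        loops[n + m] = mat_add(loops.get(n + m, ZERO), commutator(a, b))
        if n + m == 0:
            central += n * killing(a, b)
    loops = {k: v for k, v in loops.items() if v != ZERO}
    return loops, central


def add(x, y):
    loops = dict(x[0])
    for n, a in y[0].items():
        loops[n] = mat_add(loops.get(n, ZERO), a)
    loops = {k: v for k, v in loops.items() if v != ZERO}
    return loops, x[1] + y[1]


def jacobi(x, y, z):
    """True if [x,[y,z]] + [y,[z,x]] + [z,[x,y]] vanishes."""
    total = add(add(bracket(x, bracket(y, z)), bracket(y, bracket(z, x))),
                bracket(z, bracket(x, y)))
    return total == ({}, 0)
```

For instance, `bracket(({1: E}, 0), ({-1: F}, 0))` returns the element $h \otimes t^0 + 4c$: the loop part is the ordinary commutator and the central term records the exponent $n = 1$ times $\langle e | f \rangle = 4$.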

The affine Lie algebra corresponding to a finite-dimensional semisimple Lie algebra is the direct sum of the affine 
Lie algebras corresponding to its simple summands. 

Constructing the Dynkin diagrams 

The Dynkin diagram of each affine Lie algebra consists of that of the corresponding simple Lie algebra plus an 
additional node, which corresponds to the addition of an imaginary root. Of course, such a node cannot be attached 
to the Dynkin diagram in just any location, but for each simple Lie algebra there exists a number of possible 
attachments equal to the cardinality of the group of outer automorphisms of the Lie algebra. In particular, this group 
always contains the identity element, and the corresponding affine Lie algebra is called an untwisted affine Lie 
algebra. When the simple algebra admits automorphisms that are not inner automorphisms, one may obtain other 
Dynkin diagrams and these correspond to twisted affine Lie algebras. 
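The effect of the additional node can be seen on the generalized Cartan matrix, which for an affine Lie algebra is positive semi-definite with corank 1. The sketch below compares the Cartan matrix of $A_2$ with that of its untwisted affine extension $A_2^{(1)}$, whose Dynkin diagram is a triangle; the helper names are illustrative and exact rational arithmetic is used to avoid rounding issues.

```python
from fractions import Fraction


def rank(matrix):
    """Rank over the rationals by Gaussian elimination."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    r = 0
    for col in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r


def corank(matrix):
    return len(matrix) - rank(matrix)


def times(matrix, vec):
    """Matrix-vector product."""
    return [sum(a * v for a, v in zip(row, vec)) for row in matrix]


# Cartan matrix of the simple Lie algebra A_2 = sl_3.
A2 = [[2, -1],
      [-1, 2]]

# Untwisted affine extension A_2^(1): the extra node is joined to both
# nodes of A_2, so the Dynkin diagram becomes a triangle.
A2_AFFINE = [[2, -1, -1],
             [-1, 2, -1],
             [-1, -1, 2]]
```

The finite-type matrix is invertible, while the affine one has a one-dimensional kernel spanned by $(1, 1, 1)$, in line with the corank 1 condition.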






Classifying the central extensions 

The attachment of an extra node to the Dynkin diagram of the corresponding simple Lie algebra corresponds to the 
following construction. An affine Lie algebra can always be constructed as a central extension of the loop algebra of 
the corresponding simple Lie algebra. If one wishes to begin instead with a semisimple Lie algebra, then one needs 
to centrally extend by a number of elements equal to the number of simple components of the semisimple algebra. In 
physics, one often considers instead the direct sum of a semisimple algebra and an abelian algebra $\mathbb{C}^n$. In this case
one also needs to add n further central elements for the n abelian generators. 

The second integral cohomology of the loop group of the corresponding simple compact Lie group is isomorphic to 
the integers. Central extensions of the affine Lie group by a single generator are topologically circle bundles over 
this free loop group, which are classified by a two-class known as the first Chern class of the fibration. Therefore the 
central extensions of an affine Lie group are classified by a single parameter k which is called the central charge in 
the physics literature, where it first appeared. Unitary highest weight representations of the affine compact groups 
only exist when k is a natural number. More generally, if one considers a semisimple algebra, there is a central
charge for each simple component. 

Applications 

Affine Lie algebras appear naturally in theoretical physics (for example, in conformal field theories such as the WZW model and
coset models and even on the worldsheet of the heterotic string), geometry, and elsewhere in mathematics. 

References 

• Di Francesco, P.; Mathieu, P.; Sénéchal, D. (1997), Conformal Field Theory, Springer-Verlag, ISBN
0-387-94785-X 

• Fuchs, Jürgen (1992), Affine Lie Algebras and Quantum Groups, Cambridge University Press, ISBN
0-521-48412-X 

• Goddard, Peter; Olive, David (1988), Kac-Moody and Virasoro algebras: A Reprint Volume for Physicists, 
Advanced Series in Mathematical Physics, 3, World Scientific, ISBN 9971-50-419-7 

• Kac, Victor (1990), Infinite dimensional Lie algebras (3 ed.), Cambridge University Press, ISBN 0-521-46693-8 

• Kohno, Toshitake (1998), Conformal Field Theory and Topology, American Mathematical Society, ISBN 
0-8218-2130-X 

• Pressley, Andrew; Segal, Graeme (1986), Loop groups, Oxford University Press, ISBN 0-19-853535-X 






Quantum affine algebra 

In mathematics, a quantum affine algebra (or affine quantum group) is a Hopf algebra that is a q-deformation of
the universal enveloping algebra of an affine Lie algebra. They were introduced independently by Drinfeld (1985) 
and Jimbo (1985) as a special case of their general construction of a quantum group from a Cartan matrix. One of 
their principal applications has been to the theory of solvable lattice models in quantum statistical mechanics, where 
the Yang-Baxter equation occurs with a spectral parameter. Combinatorial aspects of the representation theory of 
quantum affine algebras can be described simply using crystal bases, which correspond to the degenerate case when 
the deformation parameter q vanishes and the Hamiltonian of the associated lattice model can be explicitly 
diagonalized. 



References 

• Drinfeld, V. G. (1985), "Hopf algebras and the quantum Yang-Baxter equation", Doklady Akademii Nauk SSSR 
283 (5): 1060-1064, MR802128, ISSN 0002-3264 

• Drinfeld, V. G. (1987), "A new realization of Yangians and of quantum affine algebras", Doklady Akademii Nauk 
SSSR 296 (1): 13-17, MR914215, ISSN 0002-3264 

• Frenkel, Igor B.; Reshetikhin, N. Yu. (1992), "Quantum affine algebras and holonomic difference equations",
Communications in Mathematical Physics 146 (1): 1-60, doi:10.1007/BF02099206, MR1163666,
ISSN 0010-3616

• Jimbo, Michio (1985), "A q-difference analogue of U(g) and the Yang-Baxter equation", Letters in Mathematical 
Physics 10 (1): 63-69, doi:10.1007/BF00704588, MR797001, ISSN 0377-9017

• Jimbo, Michio; Miwa, Tetsuji (1995), Algebraic analysis of solvable lattice models, CBMS Regional Conference 
Series in Mathematics, 85, Published for the Conference Board of the Mathematical Sciences, Washington, DC, 
MR1308712, ISBN 978-0-8218-0320-2 






Operator algebra 

In functional analysis, an operator algebra is an algebra of continuous linear operators on a topological vector space 
with the multiplication given by the composition of mappings. Although it is usually classified as a branch of 
functional analysis, it has direct applications to representation theory, differential geometry, quantum statistical 
mechanics and quantum field theory. 

Such algebras can be used to study arbitrary sets of operators with little algebraic relation simultaneously. From this 
point of view, operator algebras can be regarded as a generalization of spectral theory of a single operator. In general 
operator algebras are non-commutative rings. 

An operator algebra is typically required to be closed in a specified operator topology inside the algebra of all
continuous linear operators. In particular, it is a set of operators with both algebraic and topological closure
properties. In some disciplines such properties are axiomatized, and algebras with a certain topological structure become
the subject of the research.

Though algebras of operators are studied in various contexts (for example, algebras of pseudo-differential operators 
acting on spaces of distributions), the term operator algebra is usually used in reference to algebras of bounded 
operators on a Banach space or, even more specifically, in reference to algebras of operators on a separable Hilbert
space, endowed with the operator norm topology. 

In the case of operators on a Hilbert space, the adjoint map on operators gives a natural involution which provides an 
additional algebraic structure which can be imposed on the algebra. In this context, the best studied examples are 
self-adjoint operator algebras, meaning that they are closed under taking adjoints. These include C*-algebras and von 
Neumann algebras. C*-algebras can be easily characterized abstractly by a condition relating the norm, involution
and multiplication. Such abstractly defined C*-algebras can be identified with a certain closed subalgebra of the
algebra of the continuous linear operators on a suitable Hilbert space. A similar result holds for von Neumann
algebras. 
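The condition relating norm, involution and multiplication is usually stated as the C*-identity $\|a^* a\| = \|a\|^2$. As a small numerical illustration, the sketch below checks it for 2x2 complex matrices acting on $\mathbb{C}^2$, computing the operator norm from the largest eigenvalue of $a^* a$; the function names are illustrative and the closed-form eigenvalue formula is specific to the 2x2 case.

```python
import math


def adjoint(a):
    """Conjugate transpose of a 2x2 matrix."""
    return tuple(tuple(a[j][i].conjugate() for j in range(2)) for i in range(2))


def mat_mul(a, b):
    return tuple(tuple(sum(a[i][k] * b[k][j] for k in range(2))
                       for j in range(2)) for i in range(2))


def op_norm(a):
    """Operator norm on C^2: square root of the largest eigenvalue of a* a."""
    h = mat_mul(adjoint(a), a)              # Hermitian, positive semi-definite
    tr = (h[0][0] + h[1][1]).real
    det = (h[0][0] * h[1][1] - h[0][1] * h[1][0]).real
    disc = max(tr * tr - 4 * det, 0.0)      # guard against rounding below zero
    return math.sqrt((tr + math.sqrt(disc)) / 2)


def satisfies_c_star_identity(a, rel_tol=1e-9):
    """Check ||a* a|| == ||a||^2 up to floating-point tolerance."""
    return math.isclose(op_norm(mat_mul(adjoint(a), a)), op_norm(a) ** 2,
                        rel_tol=rel_tol)


sample = ((1 + 2j, 3j), (0.5 + 0j, -1j))
```

The identity holds here because $a^* a$ is positive, so its norm is its largest eigenvalue, which is the square of the largest singular value of $a$.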

Commutative self-adjoint operator algebras can be regarded as the algebra of complex-valued continuous functions
on a locally compact space, or that of measurable functions on a standard measurable space. Thus, general operator
algebras are often regarded as noncommutative generalizations of these algebras, or of the structure of the base space
on which the functions are defined. This point of view is elaborated as the philosophy of noncommutative geometry,
which tries to study various non-classical and/or pathological objects by noncommutative operator algebras.

Examples of operator algebras which are not self-adjoint include: 

• nest algebras 

• many commutative subspace lattice algebras 

• many limit algebras 

References 

• Blackadar, Bruce (2005). Operator Algebras: Theory of C*-Algebras and von Neumann Algebras. Encyclopaedia
of Mathematical Sciences. Springer-Verlag. ISBN 3540284869.






Clifford algebra 

In mathematics, Clifford algebras are a type of associative algebra. They can be thought of as one of the possible 
generalizations of the complex numbers and quaternions. The theory of Clifford algebras is intimately connected 
with the theory of quadratic forms and orthogonal transformations. Clifford algebras have important applications in a 
variety of fields including geometry and theoretical physics. They are named after the English geometer William 
Kingdon Clifford. 

Introduction and basic properties 

Specifically, a Clifford algebra is a unital associative algebra which contains and is generated by a vector space $V$
equipped with a quadratic form $Q$. The Clifford algebra $C\ell(V, Q)$ is the "freest" algebra generated by $V$ subject to the
condition

$$v^2 = Q(v) 1 \quad \text{for all } v \in V.$$

If the characteristic of the ground field $K$ is not 2, then one can rewrite this fundamental identity in the form

$$uv + vu = 2 \langle u, v \rangle \quad \text{for all } u, v \in V,$$

where $\langle u, v \rangle = \tfrac{1}{2}(Q(u + v) - Q(u) - Q(v))$ is the symmetric bilinear form associated to $Q$, via the polarization
identity. The idea of being the "freest" or "most general" algebra subject to this identity can be formally expressed
through the notion of a universal property, as done below.
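The passage from the quadratic condition to the bilinear one is a one-line polarization argument. Assuming only the fundamental identity $v^2 = Q(v)1$ applied to $u$, $v$ and $u + v$, it can be spelled out as follows.

```latex
\begin{align*}
Q(u+v)\,1 &= (u+v)^2 = u^2 + uv + vu + v^2 = Q(u)\,1 + uv + vu + Q(v)\,1, \\
uv + vu   &= \bigl(Q(u+v) - Q(u) - Q(v)\bigr)\,1 = 2\,\langle u, v\rangle\,1 .
\end{align*}
```

The last step divides by 2 inside the definition of $\langle u, v \rangle$, which is why the bilinear form is only available when the characteristic is not 2.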

Quadratic forms and Clifford algebras in characteristic 2 form an exceptional case. In particular, if char K = 2 it is 
not true that a quadratic form determines a symmetric bilinear form, or that every quadratic form admits an 
orthogonal basis. Many of the statements in this article include the condition that the characteristic is not 2, and are 
false if this condition is removed. 

As quantization of exterior algebra 

Clifford algebras are closely related to exterior algebras. In fact, if $Q = 0$ then the Clifford algebra $C\ell(V, Q)$ is just the
exterior algebra $\Lambda(V)$. For nonzero $Q$ there exists a canonical linear isomorphism between $\Lambda(V)$ and $C\ell(V, Q)$
whenever the ground field $K$ does not have characteristic two. That is, they are naturally isomorphic as vector spaces,
but with different multiplications (in the case of characteristic two, they are still isomorphic as vector spaces, just not
naturally). Clifford multiplication is strictly richer than the exterior product since it makes use of the extra
information provided by $Q$.

More precisely, Clifford algebras may be thought of as quantizations (cf. quantization (physics), Quantum group) of 
the exterior algebra, in the same way that the Weyl algebra is a quantization of the symmetric algebra. 

Weyl algebras and Clifford algebras admit a further structure of a *-algebra, and can be unified as even and odd 
terms of a superalgebra, as discussed in CCR and CAR algebras. 

Universal property and construction 

Let $V$ be a vector space over a field $K$, and let $Q : V \to K$ be a quadratic form on $V$. In most cases of interest the field
$K$ is either $\mathbb{R}$, $\mathbb{C}$ or a finite field.

A Clifford algebra $C\ell(V, Q)$ is a unital associative algebra over $K$ together with a linear map $i : V \to C\ell(V, Q)$
satisfying $i(v)^2 = Q(v) 1$ for all $v \in V$, defined by the following universal property: Given any associative algebra $A$
over $K$ and any linear map $j : V \to A$ such that

$$j(v)^2 = Q(v) 1 \quad \text{for all } v \in V$$

(where 1 denotes the multiplicative identity of $A$), there is a unique algebra homomorphism $f : C\ell(V, Q) \to A$ such
that the following diagram commutes (i.e. such that $f \circ i = j$):

[Commutative diagram: $i : V \to C\ell(V, Q)$, $j : V \to A$, $f : C\ell(V, Q) \to A$, with $f \circ i = j$.]

Working with a symmetric bilinear form $\langle \cdot , \cdot \rangle$ instead of $Q$ (in characteristic not 2), the requirement on $j$ is

$$j(v) j(w) + j(w) j(v) = 2 \langle v, w \rangle \quad \text{for all } v, w \in V.$$

A Clifford algebra as described above always exists and can be constructed as follows: start with the most general
algebra that contains $V$, namely the tensor algebra $T(V)$, and then enforce the fundamental identity by taking a
suitable quotient. In our case we want to take the two-sided ideal $I_Q$ in $T(V)$ generated by all elements of the form

$$v \otimes v - Q(v) 1 \quad \text{for all } v \in V$$

and define $C\ell(V, Q)$ as the quotient

$$C\ell(V, Q) = T(V) / I_Q.$$

It is then straightforward to show that $C\ell(V, Q)$ contains $V$ and satisfies the above universal property, so that $C\ell$ is
unique up to a unique isomorphism; thus one speaks of "the" Clifford algebra $C\ell(V, Q)$. It also follows from this
construction that $i$ is injective. One usually drops the $i$ and considers $V$ as a linear subspace of $C\ell(V, Q)$.

The universal characterization of the Clifford algebra shows that the construction of $C\ell(V, Q)$ is functorial in nature.
Namely, $C\ell$ can be considered as a functor from the category of vector spaces with quadratic forms (whose
morphisms are linear maps preserving the quadratic form) to the category of associative algebras. The universal
property guarantees that linear maps between vector spaces (preserving the quadratic form) extend uniquely to
algebra homomorphisms between the associated Clifford algebras.



Basis and dimension 

If the dimension of $V$ is $n$ and $\{e_1, \dots, e_n\}$ is a basis of $V$, then the set

$$\{ e_{i_1} e_{i_2} \cdots e_{i_k} \mid 1 \le i_1 < i_2 < \cdots < i_k \le n \text{ and } 0 \le k \le n \}$$

is a basis for $C\ell(V, Q)$. The empty product ($k = 0$) is defined as the multiplicative identity element. For each value of
$k$ there are $n$ choose $k$ basis elements, so the total dimension of the Clifford algebra is

$$\dim C\ell(V, Q) = \sum_{k=0}^{n} \binom{n}{k} = 2^n.$$
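The basis and the dimension count can be illustrated by listing the index sets directly. This is a minimal sketch: each basis product $e_{i_1} \cdots e_{i_k}$ is represented only by its increasing tuple of indices, and the function name is illustrative.

```python
from itertools import combinations
from math import comb


def clifford_basis(n):
    """Index tuples (i_1 < ... < i_k), 0 <= k <= n, labelling the basis products.

    The empty tuple stands for the empty product, i.e. the identity element.
    """
    return [idx for k in range(n + 1)
            for idx in combinations(range(1, n + 1), k)]


def dimension(n):
    """Sum of binomial coefficients over all grades k = 0..n."""
    return sum(comb(n, k) for k in range(n + 1))
```

For $n = 2$ the list is `[(), (1,), (2,), (1, 2)]`, corresponding to the basis $1, e_1, e_2, e_1 e_2$.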

Since $V$ comes equipped with a quadratic form, there is a set of privileged bases for $V$: the orthogonal ones. An
orthogonal basis is one such that

$$\langle e_i, e_j \rangle = 0 \quad \text{for } i \ne j,$$

where $\langle \cdot , \cdot \rangle$ is the symmetric bilinear form associated to $Q$. The fundamental Clifford identity implies that for an
orthogonal basis

$$e_i e_j = -e_j e_i \quad \text{for } i \ne j.$$

This makes manipulation of orthogonal basis vectors quite simple. Given a product $e_{i_1} e_{i_2} \cdots e_{i_k}$ of distinct
orthogonal basis vectors, one can put them into standard order by including an overall sign corresponding to the
number of flips needed to correctly order them (i.e. the signature of the ordering permutation).
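The reordering rule can be turned into a small routine for multiplying basis products. The sketch below assumes an orthogonal basis, so that distinct basis vectors anticommute and $e_i e_i = Q(e_i)$; products are represented by increasing index tuples, the values $Q(e_i)$ are supplied in a dictionary, and the function name is illustrative.

```python
def blade_product(a, b, q):
    """Multiply the basis products e_a and e_b for an orthogonal basis.

    a, b : strictly increasing tuples of indices
    q    : dict mapping an index i to the value Q(e_i)
    Returns (coefficient, sorted index tuple) with e_a e_b = coefficient * e_out.
    """
    coeff = 1
    factors = list(a) + list(b)
    # Bubble sort: each swap of distinct neighbours uses e_i e_j = -e_j e_i.
    swapped = True
    while swapped:
        swapped = False
        for k in range(len(factors) - 1):
            if factors[k] > factors[k + 1]:
                factors[k], factors[k + 1] = factors[k + 1], factors[k]
                coeff = -coeff
                swapped = True
    # After sorting, repeated indices are adjacent and contract: e_i e_i = Q(e_i).
    out = []
    for i in factors:
        if out and out[-1] == i:
            out.pop()
            coeff *= q[i]
        else:
            out.append(i)
    return coeff, tuple(out)
```

With `q = {1: 1, 2: 1, 3: 1}`, the call `blade_product((1, 3), (2,), q)` returns `(-1, (1, 2, 3))`, since one flip is needed to bring $e_1 e_3 e_2$ into standard order.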

If the characteristic is not 2 then an orthogonal basis for $V$ exists, and one can easily extend the quadratic form on $V$
to a quadratic form on all of $C\ell(V, Q)$ by requiring that distinct elements $e_{i_1} e_{i_2} \cdots e_{i_k}$ are orthogonal to one another
whenever the $\{e_i\}$'s are orthogonal. Additionally, one sets

$$Q(e_{i_1} e_{i_2} \cdots e_{i_k}) = Q(e_{i_1}) Q(e_{i_2}) \cdots Q(e_{i_k}).$$

The quadratic form on a scalar is just $Q(\lambda) = \lambda^2$. Thus, orthogonal bases for $V$ extend to orthogonal bases for
$C\ell(V, Q)$. The quadratic form defined in this way is actually independent of the orthogonal basis chosen (a
basis-independent formulation will be given later).

Examples: real and complex Clifford algebras 

The most important Clifford algebras are those over real and complex vector spaces equipped with nondegenerate
quadratic forms. The geometric interpretation of nondegenerate Clifford algebras is known as geometric algebra.

Every nondegenerate quadratic form on a finite-dimensional real vector space is equivalent to the standard diagonal
form:

$$Q(v) = v_1^2 + \cdots + v_p^2 - v_{p+1}^2 - \cdots - v_{p+q}^2$$

where $n = p + q$ is the dimension of the vector space. The pair of integers $(p, q)$ is called the signature of the
quadratic form. The real vector space with this quadratic form is often denoted $\mathbb{R}^{p,q}$. The Clifford algebra on $\mathbb{R}^{p,q}$ is
denoted $C\ell_{p,q}(\mathbb{R})$. The symbol $C\ell_n(\mathbb{R})$ means either $C\ell_{n,0}(\mathbb{R})$ or $C\ell_{0,n}(\mathbb{R})$ depending on whether the author prefers
positive definite or negative definite spaces.

A standard orthonormal basis $\{e_i\}$ for $\mathbb{R}^{p,q}$ consists of $n = p + q$ mutually orthogonal vectors, $p$ of which have norm
+1 and $q$ of which have norm -1. The algebra $C\ell_{p,q}(\mathbb{R})$ will therefore have $p$ vectors which square to +1 and $q$
vectors which square to -1.

Note that $C\ell_{0,0}(\mathbb{R})$ is naturally isomorphic to $\mathbb{R}$ since there are no nonzero vectors. $C\ell_{0,1}(\mathbb{R})$ is a two-dimensional
algebra generated by a single vector $e_1$ which squares to -1, and therefore is isomorphic to $\mathbb{C}$, the field of complex
numbers. The algebra $C\ell_{0,2}(\mathbb{R})$ is a four-dimensional algebra spanned by $\{1, e_1, e_2, e_1 e_2\}$. The latter three elements
square to -1 and all anticommute, and so the algebra is isomorphic to the quaternions $\mathbb{H}$. The next algebra in the
sequence, $C\ell_{0,3}(\mathbb{R})$, is an 8-dimensional algebra isomorphic to the direct sum $\mathbb{H} \oplus \mathbb{H}$, called the split-biquaternions.
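The quaternion relations in the four-dimensional case can be verified by hand. The computation below assumes an orthonormal basis $e_1, e_2$ with $e_1^2 = e_2^2 = -1$ and $e_1 e_2 = -e_2 e_1$, and uses the (conventional, not unique) identification $i = e_1$, $j = e_2$, $k = e_1 e_2$.

```latex
\begin{align*}
(e_1 e_2)^2 &= e_1 e_2 e_1 e_2 = -e_1 e_1 e_2 e_2 = -(-1)(-1) = -1, \\
e_1 (e_1 e_2) &= e_1^2 e_2 = -e_2, \qquad (e_1 e_2) e_1 = -e_1^2 e_2 = e_2, \\
e_2 (e_1 e_2) &= -e_1 e_2^2 = e_1, \qquad (e_1 e_2) e_2 = e_1 e_2^2 = -e_1 .
\end{align*}
```

In quaternion notation these read $k^2 = -1$, $ik = -j = -ki$ and $jk = i = -kj$, which together with $ij = k$ are the defining relations of $\mathbb{H}$.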

One can also study Clifford algebras on complex vector spaces. Every nondegenerate quadratic form on a complex
vector space is equivalent to the standard diagonal form

$$Q(z) = z_1^2 + z_2^2 + \cdots + z_n^2$$

where $n = \dim V$, so there is essentially only one Clifford algebra in each dimension. We will denote the Clifford
algebra on $\mathbb{C}^n$ with the standard quadratic form by $C\ell_n(\mathbb{C})$. One can show that the algebra $C\ell_n(\mathbb{C})$ may be obtained as
the complexification of the algebra $C\ell_{p,q}(\mathbb{R})$ where $n = p + q$:

$$C\ell_n(\mathbb{C}) \cong C\ell_{p,q}(\mathbb{R}) \otimes \mathbb{C} \cong C\ell(\mathbb{C}^{p+q}, Q \otimes \mathbb{C}).$$

Here $Q$ is the real quadratic form of signature $(p, q)$. Note that the complexification does not depend on the signature.
The first few cases are not hard to compute. One finds that

$$C\ell_0(\mathbb{C}) = \mathbb{C}, \qquad C\ell_1(\mathbb{C}) = \mathbb{C} \oplus \mathbb{C}, \qquad C\ell_2(\mathbb{C}) = M_2(\mathbb{C}),$$

where $M_2(\mathbb{C})$ denotes the algebra of 2x2 matrices over $\mathbb{C}$.

It turns out that every one of the algebras $C\ell_{p,q}(\mathbb{R})$ and $C\ell_n(\mathbb{C})$ is isomorphic to a matrix algebra over $\mathbb{R}$, $\mathbb{C}$, or $\mathbb{H}$ or
to a direct sum of two such algebras. For a complete classification of these algebras see classification of Clifford
algebras.






Properties 

Relation to the exterior algebra 

Given a vector space $V$ one can construct the exterior algebra $\Lambda(V)$, whose definition is independent of any quadratic
form on $V$. It turns out that if $K$ does not have characteristic 2 then there is a natural isomorphism between $\Lambda(V)$ and
$C\ell(V, Q)$ considered as vector spaces (and there exists an isomorphism in characteristic two, which may not be
natural). This is an algebra isomorphism if and only if $Q = 0$. One can thus consider the Clifford algebra $C\ell(V, Q)$ as
an enrichment (or more precisely, a quantization, cf. the Introduction) of the exterior algebra on $V$ with a
multiplication that depends on $Q$ (one can still define the exterior product independent of $Q$).

The easiest way to establish the isomorphism is to choose an orthogonal basis $\{e_i\}$ for $V$ and extend it to an
orthogonal basis for $C\ell(V, Q)$ as described above. The map $C\ell(V, Q) \to \Lambda(V)$ is determined by

$$e_{i_1} e_{i_2} \cdots e_{i_k} \mapsto e_{i_1} \wedge e_{i_2} \wedge \cdots \wedge e_{i_k}.$$

Note that this only works if the basis $\{e_i\}$ is orthogonal. One can show that this map is independent of the choice of
orthogonal basis and so gives a natural isomorphism.

If the characteristic of $K$ is 0, one can also establish the isomorphism by antisymmetrizing. Define functions
$f_k : V \times \cdots \times V \to C\ell(V, Q)$ by

$$f_k(v_1, \dots, v_k) = \frac{1}{k!} \sum_{\sigma \in S_k} \operatorname{sgn}(\sigma)\, v_{\sigma(1)} \cdots v_{\sigma(k)}$$

where the sum is taken over the symmetric group on $k$ elements. Since $f_k$ is alternating it induces a unique linear map
$\Lambda^k(V) \to C\ell(V, Q)$. The direct sum of these maps gives a linear map between $\Lambda(V)$ and $C\ell(V, Q)$. This map can be
shown to be a linear isomorphism, and it is natural.
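For $k = 2$ the antisymmetrization can be written out explicitly. Assuming the fundamental identity in the form $v_1 v_2 + v_2 v_1 = 2 \langle v_1, v_2 \rangle$, the image of $v_1 \wedge v_2$ differs from the Clifford product $v_1 v_2$ by a scalar correction:

```latex
\begin{align*}
f_2(v_1, v_2) &= \tfrac{1}{2}\,(v_1 v_2 - v_2 v_1)
               = \tfrac{1}{2}\,\bigl(v_1 v_2 - (2\langle v_1, v_2\rangle - v_1 v_2)\bigr) \\
              &= v_1 v_2 - \langle v_1, v_2 \rangle .
\end{align*}
```

In particular $f_2(v_1, v_2) = v_1 v_2$ exactly when $v_1$ and $v_2$ are orthogonal, which is consistent with the basis description of the isomorphism given for orthogonal bases.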

A more sophisticated way to view the relationship is to construct a filtration on $C\ell(V, Q)$. Recall that the tensor
algebra $T(V)$ has a natural filtration: $F^0 \subset F^1 \subset F^2 \subset \cdots$ where $F^k$ contains sums of tensors with rank $\le k$. Projecting
this down to the Clifford algebra gives a filtration on $C\ell(V, Q)$. The associated graded algebra

$$\operatorname{Gr}_F C\ell(V, Q) = \bigoplus_k F^k / F^{k-1}$$

is naturally isomorphic to the exterior algebra $\Lambda(V)$. Since the associated graded algebra of a filtered algebra is
always isomorphic to the filtered algebra as filtered vector spaces (by choosing complements of $F^k$ in $F^{k+1}$ for all $k$),
this provides an isomorphism (although not a natural one) in any characteristic, even two.

Grading 

In the following, assume that the characteristic is not 2. 

Clifford algebras are $\mathbb{Z}_2$-graded algebras (also known as superalgebras). Indeed, the linear map on $V$ defined by
$v \mapsto -v$ (reflection through the origin) preserves the quadratic form $Q$ and so by the universal property of Clifford
algebras extends to an algebra automorphism

$$\alpha : C\ell(V, Q) \to C\ell(V, Q).$$

Since $\alpha$ is an involution (i.e. it squares to the identity) one can decompose $C\ell(V, Q)$ into positive and negative
eigenspaces

$$C\ell(V, Q) = C\ell^0(V, Q) \oplus C\ell^1(V, Q)$$

where $C\ell^i(V, Q) = \{ x \in C\ell(V, Q) \mid \alpha(x) = (-1)^i x \}$. Since $\alpha$ is an automorphism it follows that

$$C\ell^i(V, Q)\, C\ell^j(V, Q) \subseteq C\ell^{i+j}(V, Q)$$

where the superscripts are read modulo 2. This gives $C\ell(V, Q)$ the structure of a $\mathbb{Z}_2$-graded algebra. The subspace
$C\ell^0(V, Q)$ forms a subalgebra of $C\ell(V, Q)$, called the even subalgebra. The subspace $C\ell^1(V, Q)$ is called the odd part
of $C\ell(V, Q)$ (it is not a subalgebra). The $\mathbb{Z}_2$-grading plays an important role in the analysis and application of Clifford
algebras. The automorphism $\alpha$ is called the main involution or grade involution.

Remark. In characteristic not 2 the underlying vector space of $C\ell(V,Q)$ inherits a $\mathbb{Z}$-grading from the canonical
isomorphism with the underlying vector space of the exterior algebra $\Lambda(V)$. It is important to note, however, that this
is a vector space grading only. That is, Clifford multiplication does not respect the $\mathbb{Z}$-grading, only the $\mathbb{Z}_2$-grading:
for instance if $Q(v) \neq 0$, then $v \in C\ell^1(V,Q)$, but $v^2 \in C\ell^0(V,Q)$, not in $C\ell^2(V,Q)$. Happily, the
gradings are related in the natural way: $\mathbb{Z}_2 = \mathbb{Z}/2\mathbb{Z}$. Further, the Clifford algebra is $\mathbb{Z}$-filtered:
$C\ell^{\le i}(V,Q) \cdot C\ell^{\le j}(V,Q) \subset C\ell^{\le i+j}(V,Q)$. The degree of a Clifford number usually refers to the degree in
the $\mathbb{Z}$-grading. Elements which are pure in the $\mathbb{Z}_2$-grading are simply said to be even or odd.
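
The behaviour described in the remark can be checked on a small example. The following is a minimal Python sketch and is not part of the source: it assumes an orthogonal basis e_0, ..., e_{n-1} with Q(e_i) = q[i], encodes a product of distinct basis vectors as a bitmask, and stores a Clifford number as a dictionary from bitmask to coefficient. The names blade_mul, mul and parities are illustrative only.

```python
def blade_mul(a, b, q):
    """Multiply two basis products a, b (bitmasks) for an orthogonal basis with Q(e_i) = q[i].

    Returns (bitmask of the result, scalar coefficient).
    """
    swaps = 0
    t = a >> 1
    while t:
        # every pair (i in a, j in b) with i > j costs one anticommutation
        swaps += bin(t & b).count("1")
        t >>= 1
    coeff = -1 if swaps % 2 else 1
    common, i = a & b, 0
    while common:
        if common & 1:
            coeff *= q[i]  # e_i e_i = Q(e_i)
        common >>= 1
        i += 1
    return a ^ b, coeff


def mul(x, y, q):
    """Clifford product of two elements given as {bitmask: coefficient} dictionaries."""
    out = {}
    for a, ca in x.items():
        for b, cb in y.items():
            blade, c = blade_mul(a, b, q)
            out[blade] = out.get(blade, 0) + ca * cb * c
    return {blade: c for blade, c in out.items() if c != 0}


def parities(x):
    """Set of Z_2-degrees (0 = even, 1 = odd) occurring in x."""
    return {bin(blade).count("1") % 2 for blade in x}


q = [1, 1, -1]                       # assumed form: Q(e0) = Q(e1) = 1, Q(e2) = -1
v = {0b001: 2, 0b010: 3, 0b100: 1}   # v = 2 e0 + 3 e1 + e2, so Q(v) = 4 + 9 - 1 = 12
v_squared = mul(v, v, q)             # a scalar: degree 0, not degree 2
w = mul(v, {0b011: 1}, q)            # odd times even is odd
print(v_squared)                     # {0: 12}
print(parities(v), parities(v_squared), parities(w))
```

Here v has degree 1 while v·v is the scalar Q(v), so the product lands in degree 0 rather than degree 2, although the degrees still add modulo 2.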

The even subalgebra $C\ell^0(V,Q)$ of a Clifford algebra is itself a Clifford algebra. If $V$ is the orthogonal direct sum of
a vector $a$ of norm $Q(a)$ and a subspace $U$, then $C\ell^0(V,Q)$ is isomorphic to $C\ell(U, -Q(a)Q)$, where $-Q(a)Q$ is the form
$Q$ restricted to $U$ and multiplied by $-Q(a)$. In particular over the reals this implies that

$$C\ell^0_{p,q}(\mathbb{R}) \cong C\ell_{p,q-1}(\mathbb{R}) \text{ for } q > 0, \text{ and}$$

$$C\ell^0_{p,q}(\mathbb{R}) \cong C\ell_{q,p-1}(\mathbb{R}) \text{ for } p > 0.$$

In the negative-definite case this gives an inclusion $C\ell_{0,n-1}(\mathbb{R}) \subset C\ell_{0,n}(\mathbb{R})$ which extends the sequence

$$\mathbb{R} \subset \mathbb{C} \subset \mathbb{H} \subset \mathbb{H} \oplus \mathbb{H} \subset \cdots$$

Likewise, in the complex case, one can show that the even subalgebra of $C\ell_n(\mathbb{C})$ is isomorphic to $C\ell_{n-1}(\mathbb{C})$.

Antiautomorphisms 

In addition to the automorphism $\alpha$, there are two antiautomorphisms which play an important role in the analysis of
Clifford algebras. Recall that the tensor algebra $T(V)$ comes with an antiautomorphism that reverses the order in all
products:

$$v_1 \otimes v_2 \otimes \cdots \otimes v_k \mapsto v_k \otimes \cdots \otimes v_2 \otimes v_1.$$

Since the ideal $I_Q$ is invariant under this reversal, this operation descends to an antiautomorphism of $C\ell(V,Q)$ called
the transpose or reversal operation, denoted by $x^t$. The transpose is an antiautomorphism: $(xy)^t = y^t x^t$. The
transpose operation makes no use of the $\mathbb{Z}_2$-grading so we define a second antiautomorphism by composing $\alpha$ and
the transpose. We call this operation Clifford conjugation, denoted $\bar{x}$:

$$\bar{x} = \alpha(x^t) = \alpha(x)^t.$$

Of the two antiautomorphisms, the transpose is the more fundamental. [4]

Note that all of these operations are involutions. One can show that they act as ±1 on elements which are pure in the
$\mathbb{Z}$-grading. In fact, all three operations depend only on the degree modulo 4. That is, if $x$ is pure with degree $k$ then

$$\alpha(x) = \pm x, \qquad x^t = \pm x, \qquad \bar{x} = \pm x$$

where the signs are given by the following table:



k mod 4       0    1    2    3    general formula

$\alpha(x)$         +    −    +    −    $(-1)^k$

$x^t$           +    +    −    −    $(-1)^{k(k-1)/2}$

$\bar{x}$           +    −    −    +    $(-1)^{k(k+1)/2}$
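
As a quick check of the table, the three sign formulas can be evaluated directly. This is a small illustrative Python sketch (the function name signs is hypothetical, not from the source); it also confirms that the signs depend only on k modulo 4 and that the conjugation sign is the product of the other two.

```python
def signs(k):
    """Signs with which the grade involution, the transpose and Clifford
    conjugation act on an element that is pure of degree k."""
    alpha = (-1) ** k
    transpose = (-1) ** (k * (k - 1) // 2)
    conjugate = (-1) ** (k * (k + 1) // 2)
    return alpha, transpose, conjugate


for k in range(4):
    print(k, signs(k))
# 0 (1, 1, 1)
# 1 (-1, 1, -1)
# 2 (1, -1, -1)
# 3 (-1, -1, 1)
```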






The Clifford scalar product 

When the characteristic is not 2 the quadratic form $Q$ on $V$ can be extended to a quadratic form on all of $C\ell(V,Q)$ as
explained earlier (which we also denoted by $Q$). A basis independent definition is

$$Q(x) = \langle x^t x \rangle$$

where $\langle a \rangle$ denotes the scalar part of $a$ (the grade 0 part in the $\mathbb{Z}$-grading). One can show that

$$Q(v_1 v_2 \cdots v_k) = Q(v_1) Q(v_2) \cdots Q(v_k)$$

where the $v_i$ are elements of $V$; this identity is not true for arbitrary elements of $C\ell(V,Q)$.

The associated symmetric bilinear form on $C\ell(V,Q)$ is given by

$$\langle x, y \rangle = \langle x^t y \rangle.$$

One can check that this reduces to the original bilinear form when restricted to $V$. The bilinear form on all of
$C\ell(V,Q)$ is nondegenerate if and only if it is nondegenerate on $V$.

It is not hard to verify that the transpose is the adjoint of left/right Clifford multiplication with respect to this inner
product. That is,

$$\langle ax, y \rangle = \langle x, a^t y \rangle, \text{ and}$$
$$\langle xa, y \rangle = \langle x, y a^t \rangle.$$
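
The multiplicativity of $Q$ on products of vectors can be verified numerically. The sketch below is illustrative and not from the source: it assumes an orthogonal basis with Q(e_i) = q[i], represents elements as {bitmask: coefficient} dictionaries, and implements the transpose through the sign $(-1)^{k(k-1)/2}$ on products of k distinct basis vectors. All function names are hypothetical.

```python
def blade_mul(a, b, q):
    """Product of two basis products (bitmasks) for an orthogonal basis with Q(e_i) = q[i]."""
    swaps = 0
    t = a >> 1
    while t:
        swaps += bin(t & b).count("1")  # pairs (i in a, j in b) with i > j
        t >>= 1
    coeff = -1 if swaps % 2 else 1
    common, i = a & b, 0
    while common:
        if common & 1:
            coeff *= q[i]  # e_i e_i = Q(e_i)
        common >>= 1
        i += 1
    return a ^ b, coeff


def mul(x, y, q):
    """Clifford product of two elements given as {bitmask: coefficient} dictionaries."""
    out = {}
    for a, ca in x.items():
        for b, cb in y.items():
            blade, c = blade_mul(a, b, q)
            out[blade] = out.get(blade, 0) + ca * cb * c
    return {blade: c for blade, c in out.items() if c != 0}


def transpose(x):
    """Reversal: a product of k distinct orthogonal basis vectors picks up (-1)^(k(k-1)/2)."""
    out = {}
    for blade, c in x.items():
        k = bin(blade).count("1")
        out[blade] = c * (-1) ** (k * (k - 1) // 2)
    return out


def scalar_product(x, y, q):
    """<x, y> = scalar part of x^t y."""
    return mul(transpose(x), y, q).get(0, 0)


q = [1, 1, -1]                        # assumed form of signature (2, 1)
v1 = {0b001: 2, 0b010: 3, 0b100: 1}   # Q(v1) = 4 + 9 - 1 = 12
v2 = {0b001: 1, 0b100: 2}             # Q(v2) = 1 - 4 = -3
x = mul(v1, v2, q)
print(scalar_product(x, x, q))        # -36, which equals Q(v1) * Q(v2)
```

On vectors the same function returns the original bilinear form, for example scalar_product(v1, v2, q) is 2·1 − 1·2 = 0 here.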

Structure of Clifford algebras 

In this section we assume that the vector space V is finite dimensional and that the bilinear form of Q is non-singular. 
A central simple algebra over K is a matrix algebra over a (finite dimensional) division algebra with center K. For 
example, the central simple algebras over the reals are matrix algebras over either the reals or the quaternions. 

• If $V$ has even dimension then $C\ell(V,Q)$ is a central simple algebra over $K$.

• If $V$ has even dimension then $C\ell^0(V,Q)$ is a central simple algebra over a quadratic extension of $K$ or a sum of two
isomorphic central simple algebras over $K$.

• If $V$ has odd dimension then $C\ell(V,Q)$ is a central simple algebra over a quadratic extension of $K$ or a sum of two
isomorphic central simple algebras over $K$.

• If $V$ has odd dimension then $C\ell^0(V,Q)$ is a central simple algebra over $K$.

The structure of Clifford algebras can be worked out explicitly using the following result. Suppose that $U$ has even
dimension and a non-singular bilinear form with discriminant $d$, and suppose that $V$ is another vector space with a
quadratic form. The Clifford algebra of $U+V$ is isomorphic to the tensor product of the Clifford algebras of $U$ and
$(-1)^{\dim(U)/2} d\,V$, which is the space $V$ with its quadratic form multiplied by $(-1)^{\dim(U)/2} d$. Over the reals, this implies
in particular that

$$C\ell_{p+2,q}(\mathbb{R}) = M_2(\mathbb{R}) \otimes C\ell_{q,p}(\mathbb{R})$$
$$C\ell_{p+1,q+1}(\mathbb{R}) = M_2(\mathbb{R}) \otimes C\ell_{p,q}(\mathbb{R})$$
$$C\ell_{p,q+2}(\mathbb{R}) = \mathbb{H} \otimes C\ell_{q,p}(\mathbb{R}).$$

These formulas can be used to find the structure of all real Clifford algebras; see the classification of Clifford 
algebras. 

Notably, the Morita equivalence class of a Clifford algebra (its representation theory: the equivalence class of the
category of modules over it) depends only on the signature $p - q \bmod 8$. This is an algebraic form of Bott
periodicity.
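
The three real recurrences above are enough to compute every $C\ell_{p,q}(\mathbb{R})$ mechanically. The following Python sketch is an illustration under stated assumptions, not a construction from the source: it takes as base cases the standard facts $C\ell_{0,0} = \mathbb{R}$, $C\ell_{1,0} = \mathbb{R} \oplus \mathbb{R}$ and $C\ell_{0,1} = \mathbb{C}$, uses the standard tensor identities $\mathbb{H} \otimes \mathbb{C} = M_2(\mathbb{C})$ and $\mathbb{H} \otimes \mathbb{H} = M_4(\mathbb{R})$, and encodes an algebra as a hypothetical triple (division algebra, matrix size, number of equal summands).

```python
def tensor_M2R(alg):
    """M_2(R) tensor (M_n(D))^c = (M_2n(D))^c."""
    division, size, copies = alg
    return (division, 2 * size, copies)


def tensor_H(alg):
    """H tensor (M_n(D))^c for D = R, C or H."""
    division, size, copies = alg
    if division == "R":
        return ("H", size, copies)       # H (x) M_n(R) = M_n(H)
    if division == "C":
        return ("C", 2 * size, copies)   # H (x) M_n(C) = M_2n(C)
    return ("R", 4 * size, copies)       # H (x) M_n(H) = M_4n(R)


def clifford(p, q):
    """Structure of Cl_{p,q}(R) as (division algebra, matrix size, number of summands)."""
    if (p, q) == (0, 0):
        return ("R", 1, 1)
    if (p, q) == (1, 0):
        return ("R", 1, 2)               # R + R
    if (p, q) == (0, 1):
        return ("C", 1, 1)
    if p >= 1 and q >= 1:
        return tensor_M2R(clifford(p - 1, q - 1))   # Cl_{p+1,q+1} = M_2(R) (x) Cl_{p,q}
    if p >= 2:
        return tensor_M2R(clifford(q, p - 2))       # Cl_{p+2,q} = M_2(R) (x) Cl_{q,p}
    return tensor_H(clifford(q - 2, p))             # Cl_{p,q+2} = H (x) Cl_{q,p}


print(clifford(0, 2))   # ('H', 1, 1): the quaternions
print(clifford(1, 3))   # ('H', 2, 1): 2 by 2 quaternionic matrices
print(clifford(0, 8))   # ('R', 16, 1): 16 by 16 real matrices
```

Comparing clifford(p, q) with clifford(p + 8, q) shows the same division algebra and number of summands with the matrix size multiplied by 16, which is the period-8 behaviour mentioned above.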






The Clifford group Γ

In this section we assume that $V$ is finite dimensional and the quadratic form $Q$ is nondegenerate.

The invertible elements of the Clifford algebra act on it by twisted conjugation: conjugation by $x$ maps

$$y \mapsto x\, y\, \alpha(x)^{-1}.$$

The Clifford group $\Gamma$ is defined to be the set of invertible elements $x$ that stabilize vectors, meaning that

$$x\, v\, \alpha(x)^{-1} \in V$$

for all $v$ in $V$.

This formula also defines an action of the Clifford group on the vector space $V$ that preserves the norm $Q$, and so
gives a homomorphism from the Clifford group to the orthogonal group. The Clifford group contains all elements $r$
of $V$ of nonzero norm, and these act on $V$ by the corresponding reflections that take $v$ to $v - \langle v, r \rangle r / Q(r)$. (In
characteristic 2 these are called orthogonal transvections rather than reflections.)

The Clifford group $\Gamma$ is the disjoint union of two subsets $\Gamma^0$ and $\Gamma^1$, where $\Gamma^i$ is the subset of elements of degree $i$.
The subset $\Gamma^0$ is a subgroup of index 2 in $\Gamma$.
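
The reflection formula can be tried out with exact arithmetic. The sketch below is illustrative only. It assumes a diagonal form Q(v) = Σ q[i]·v[i]², and it assumes that $\langle v, r \rangle$ denotes the polarization $Q(v+r) - Q(v) - Q(r)$, which is the normalization under which the stated formula is a reflection; the function names are hypothetical.

```python
from fractions import Fraction


def Q(v, q):
    """Diagonal quadratic form Q(v) = sum of q[i] * v[i]^2."""
    return sum(qi * x * x for qi, x in zip(q, v))


def polar(v, r, q):
    """Assumed convention: <v, r> = Q(v + r) - Q(v) - Q(r)."""
    s = [a + b for a, b in zip(v, r)]
    return Q(s, q) - Q(v, q) - Q(r, q)


def reflect(v, r, q):
    """Reflection v -> v - <v, r> r / Q(r) determined by a vector r with Q(r) != 0."""
    factor = Fraction(polar(v, r, q), Q(r, q))
    return [a - factor * b for a, b in zip(v, r)]


q = [1, 1, -1]        # assumed indefinite form of signature (2, 1)
r = [1, 2, 1]         # Q(r) = 1 + 4 - 1 = 4
v = [3, 0, 2]         # Q(v) = 9 - 4 = 5
w = reflect(v, r, q)
print(w, Q(w, q))     # w = [5/2, -1, 3/2] and Q(w) = 5
```

The reflection sends r to −r, preserves Q, and applying it twice returns the original vector.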

If $V$ is a finite dimensional real vector space with positive definite (or negative definite) quadratic form then the
Clifford group maps onto the orthogonal group of $V$ with respect to the form (by the Cartan-Dieudonné theorem) and
the kernel consists of the nonzero elements of the field $K$. This leads to exact sequences

$$1 \to K^* \to \Gamma \to O_V(K) \to 1,$$
$$1 \to K^* \to \Gamma^0 \to SO_V(K) \to 1.$$

Over other fields or with indefinite forms, the map is not in general onto, and the failure is captured by the spinor 
norm. 

Spinor norm 

In arbitrary characteristic, the spinor norm $Q$ is defined on the Clifford group by

$$Q(x) = x^t x.$$

It is a homomorphism from the Clifford group to the group $K^*$ of non-zero elements of $K$. It coincides with the
quadratic form $Q$ of $V$ when $V$ is identified with a subspace of the Clifford algebra. Several authors define the spinor
norm slightly differently, so that it differs from the one here by a factor of −1, 2, or −2 on $\Gamma^1$. The difference is not
very important in characteristic other than 2.

The nonzero elements of $K$ have spinor norm in the group $K^{*2}$ of squares of nonzero elements of the field $K$. So
when $V$ is finite dimensional and non-singular we get an induced map from the orthogonal group of $V$ to the group
$K^*/K^{*2}$, also called the spinor norm. The spinor norm of the reflection of a vector $r$ has image $Q(r)$ in $K^*/K^{*2}$, and
this property uniquely defines it on the orthogonal group. This gives exact sequences:

$$1 \to \{\pm 1\} \to \mathrm{Pin}_V(K) \to O_V(K) \to K^*/K^{*2},$$
$$1 \to \{\pm 1\} \to \mathrm{Spin}_V(K) \to SO_V(K) \to K^*/K^{*2}.$$

Note that in characteristic 2 the group {±1} has just one element.

From the point of view of Galois cohomology of algebraic groups, the spinor norm is a connecting homomorphism
on cohomology. Writing $\mu_2$ for the algebraic group of square roots of 1 (over a field of characteristic not 2 it is
roughly the same as a two-element group with trivial Galois action), the short exact sequence

$$1 \to \mu_2 \to \mathrm{Pin}_V \to O_V \to 1$$

yields a long exact sequence on cohomology, which begins

$$1 \to H^0(\mu_2; K) \to H^0(\mathrm{Pin}_V; K) \to H^0(O_V; K) \to H^1(\mu_2; K).$$






The 0th Galois cohomology group of an algebraic group with coefficients in $K$ is just the group of $K$-valued points:
$H^0(G; K) = G(K)$, and $H^1(\mu_2; K) \cong K^*/K^{*2}$, which recovers the previous sequence

$$1 \to \{\pm 1\} \to \mathrm{Pin}_V(K) \to O_V(K) \to K^*/K^{*2},$$

where the spinor norm is the connecting homomorphism $H^0(O_V; K) \to H^1(\mu_2; K)$.

Spin and Pin groups 

In this section we assume that $V$ is finite dimensional and its bilinear form is non-singular. (If $K$ has characteristic 2
this implies that the dimension of $V$ is even.)

The Pin group $\mathrm{Pin}_V(K)$ is the subgroup of the Clifford group $\Gamma$ of elements of spinor norm 1, and similarly the Spin
group $\mathrm{Spin}_V(K)$ is the subgroup of elements of Dickson invariant 0 in $\mathrm{Pin}_V(K)$. When the characteristic is not 2, these
are the elements of determinant 1. The Spin group usually has index 2 in the Pin group.

Recall from the previous section that there is a homomorphism from the Clifford group onto the orthogonal group.
We define the special orthogonal group to be the image of $\Gamma^0$. If $K$ does not have characteristic 2 this is just the
group of elements of the orthogonal group of determinant 1. If $K$ does have characteristic 2, then all elements of the
orthogonal group have determinant 1, and the special orthogonal group is the set of elements of Dickson invariant 0.

There is a homomorphism from the Pin group to the orthogonal group. The image consists of the elements of spinor
norm $1 \in K^*/K^{*2}$. The kernel consists of the elements +1 and −1, and has order 2 unless $K$ has characteristic 2.
Similarly there is a homomorphism from the Spin group to the special orthogonal group of $V$.

In the common case when $V$ is a positive or negative definite space over the reals, the spin group maps onto the
special orthogonal group, and is simply connected when $V$ has dimension at least 3. Further, the kernel of this
homomorphism consists of 1 and −1. So in this case the spin group, Spin(n), is a double cover of SO(n). Note,
however, that the simple connectedness of the spin group is not true in general: if $V$ is $\mathbb{R}^{p,q}$ for $p$ and $q$ both at least 2
then the spin group is not simply connected. In this case the algebraic group $\mathrm{Spin}_{p,q}$ is simply connected as an
algebraic group, even though its group of real valued points $\mathrm{Spin}_{p,q}(\mathbb{R})$ is not simply connected. This is a rather
subtle point, which completely confused the authors of at least one standard book about spin groups.

Spinors 

Clifford algebras $C\ell_{p,q}(\mathbb{C})$, with $p+q=2n$ even, are matrix algebras which have a complex representation of
dimension $2^n$. By restricting to the group $\mathrm{Pin}_{p,q}(\mathbb{R})$ we get a complex representation of the Pin group of the same
dimension, called the spin representation. If we restrict this to the spin group $\mathrm{Spin}_{p,q}(\mathbb{R})$ then it splits as the sum of
two half spin representations (or Weyl representations) of dimension $2^{n-1}$.

If $p+q=2n+1$ is odd then the Clifford algebra $C\ell_{p,q}(\mathbb{C})$ is a sum of two matrix algebras, each of which has a
representation of dimension $2^n$, and these are also both representations of the Pin group $\mathrm{Pin}_{p,q}(\mathbb{R})$. On restriction to
the spin group $\mathrm{Spin}_{p,q}(\mathbb{R})$ these become isomorphic, so the spin group has a complex spinor representation of
dimension $2^n$.

More generally, spinor groups and pin groups over any field have similar representations whose exact structure 
depends on the structure of the corresponding Clifford algebras: whenever a Clifford algebra has a factor that is a 
matrix algebra over some division algebra, we get a corresponding representation of the pin and spin groups over 
that division algebra. For examples over the reals see the article on spinors. 






Real spinors 

To describe the real spin representations, one must know how the spin group sits inside its Clifford algebra. The Pin
group, $\mathrm{Pin}_{p,q}$, is the set of invertible elements in $C\ell_{p,q}$ which can be written as a product of unit vectors:

$$\mathrm{Pin}_{p,q} = \{ v_1 v_2 \cdots v_r \mid \forall i,\ \|v_i\| = \pm 1 \}.$$

Comparing with the above concrete realizations of the Clifford algebras, the Pin group corresponds to the products
of arbitrarily many reflections: it is a cover of the full orthogonal group O(p,q). The Spin group consists of those
elements of $\mathrm{Pin}_{p,q}$ which are products of an even number of unit vectors. Thus by the Cartan-Dieudonné theorem
Spin is a cover of the group of proper rotations SO(p,q).

Let $\alpha : C\ell \to C\ell$ be the automorphism which is given by −Id acting on pure vectors. Then in particular, $\mathrm{Spin}_{p,q}$ is the
subgroup of $\mathrm{Pin}_{p,q}$ whose elements are fixed by $\alpha$. Let

$$C\ell^0_{p,q} = \{ x \in C\ell_{p,q} \mid \alpha(x) = x \}.$$

(These are precisely the elements of even degree in $C\ell_{p,q}$.) Then the spin group lies within $C\ell^0_{p,q}$.

The irreducible representations of $C\ell_{p,q}$ restrict to give representations of the pin group. Conversely, since the pin
group is generated by unit vectors, all of its irreducible representations are induced in this manner. Thus the two
representations coincide. For the same reasons, the irreducible representations of the spin group coincide with the
irreducible representations of $C\ell^0_{p,q}$.

To classify the pin representations, one need only appeal to the classification of Clifford algebras. To find the spin
representations (which are representations of the even subalgebra), one can first make use of either of the
isomorphisms (see above)

$$C\ell^0_{p,q} \cong C\ell_{p,q-1}, \text{ for } q > 0$$

$$C\ell^0_{p,q} \cong C\ell_{q,p-1}, \text{ for } p > 0$$

and realize a spin representation in signature (p,q) as a pin representation in either signature (p,q−1) or (q,p−1).



Applications 
Differential geometry 

One of the principal applications of the exterior algebra is in differential geometry where it is used to define the 
bundle of differential forms on a smooth manifold. In the case of a (pseudo-)Riemannian manifold, the tangent 
spaces come equipped with a natural quadratic form induced by the metric. Thus, one can define a Clifford bundle in 
analogy with the exterior bundle. This has a number of important applications in Riemannian geometry. Perhaps
more important is the link to a spin manifold, its associated spinor bundle and spin^c manifolds.



Physics 

Clifford algebras have numerous important applications in physics. Physicists usually consider a Clifford algebra to
be an algebra spanned by matrices $\gamma_1, \ldots, \gamma_n$ called Dirac matrices which have the property that

$$\gamma_i \gamma_j + \gamma_j \gamma_i = 2\eta_{ij}$$

where $\eta$ is the matrix of a quadratic form of signature (p,q), typically (1,3) when working in Minkowski space.
These are exactly the defining relations for the Clifford algebra $C\ell_{1,3}(\mathbb{C})$ (up to an unimportant factor of 2), which by
the classification of Clifford algebras is isomorphic to the algebra of 4 by 4 complex matrices.

The Dirac matrices were first written down by Paul Dirac when he was trying to write a relativistic first-order wave 
equation for the electron, and give an explicit isomorphism from the Clifford algebra to the algebra of complex 
matrices. The result was used to define the Dirac equation and introduce the Dirac operator. The entire Clifford 
algebra shows up in quantum field theory in the form of Dirac field bilinears. 






Computer Vision 

Recently, Clifford algebras have been applied in the problem of action recognition and classification in computer 
vision. Rodriguez et al. [5] propose a Clifford embedding to generalize traditional MACH filters to video (3D
spatiotemporal volume), and vector-valued data such as optical flow. Vector-valued data is analyzed using the 
Clifford Fourier transform. Based on these vectors action filters are synthesized in the Clifford Fourier domain and 
recognition of actions is performed using Clifford Correlation. The authors demonstrate the effectiveness of the 
Clifford embedding by recognizing actions typically performed in classic feature films and sports broadcast 
television. 

Notes 

[1] Mathematicians who work with real Clifford algebras and prefer positive definite quadratic forms (especially those working in index theory)
sometimes use a different choice of sign in the fundamental Clifford identity. That is, they take $v^2 = -Q(v)$. One must replace $Q$ with $-Q$ in
going from one convention to the other.
[2] Thus the group algebra K[Z/2] is semisimple and the Clifford algebra splits into eigenspaces of the main involution.
[3] We are still assuming that the characteristic is not 2.

[4] The opposite is true when using the alternate (−) sign convention for Clifford algebras: it is the conjugate which is more important. In general,
the meanings of conjugation and transpose are interchanged when passing from one sign convention to the other. For example, in the
convention used here the inverse of a vector is given by $v^{-1} = v^t / Q(v)$ while in the (−) convention it is given by
$v^{-1} = \bar{v} / Q(v)$.

[5] Rodriguez, Mikel; Shah, M (2008). "Action MACH: A Spatio-Temporal Maximum Average Correlation Height Filter for Action 
Classification". Computer Vision and Pattern Recognition (CVPR). 

References 

• Bourbaki, Nicolas (1988), Algebra, Berlin, New York: Springer-Verlag, ISBN 978-3-540-19373-9, section XI.9.

• Carnahan, S. Borcherds Seminar Notes, Uncut. Week 5, "Spinors and Clifford Algebras". 

• Lawson, H. Blaine; Michelsohn, Marie-Louise (1989), Spin Geometry, Princeton, NJ: Princeton University Press, 
ISBN 978-0-691-08542-5. An advanced textbook on Clifford algebras and their applications to differential 
geometry. 

• Lounesto, Pertti (2001), Clifford algebras and spinors, Cambridge: Cambridge University Press, 
ISBN 978-0-521-00551-7 

• Porteous, Ian R. (1995), Clifford algebras and the classical groups, Cambridge: Cambridge University Press, 
ISBN 978-0-521-55177-9 

External links 

• Planetmath entry on Clifford algebras (http://planetmath.org/encyclopedia/CliffordAlgebra2.html) 

• A history of Clifford algebras (http://members.fortunecity.com/jonhays/clifhistory.htm) (unverified) 

• John Baez on Clifford algebras (http://www.math.ucr.edu/home/baez/octonions/node6.html) 






Distributions 



In mathematical analysis, distributions (or generalized functions) are objects that generalize functions. 
Distributions make it possible to differentiate functions whose derivatives do not exist in the classical sense. In 
particular, any locally integrable function has a distributional derivative. Distributions are widely used to formulate 
generalized solutions of partial differential equations. Where a classical solution may not exist or be very difficult to
establish, a distribution solution to a differential equation is often much easier to obtain. Distributions are also important in
physics and engineering where many problems naturally lead to differential equations whose solutions or initial 
conditions are distributions, such as the Dirac delta distribution. 

Generalized functions were introduced by Sergei Sobolev in 1935. They were re-introduced in the late 1940s by 
Laurent Schwartz, who developed a comprehensive theory of distributions. 

Basic idea 



[Figure: a typical test function, the bump function. It is smooth (infinitely differentiable) and has compact
support (is zero outside an interval, in this case the interval [−1, 1]).]



Distributions are a class of linear functionals that map a set of test functions (conventional and well-behaved
functions) into the set of real numbers. In the simplest case, the set of test functions considered is D(R), which is the
set of functions from R to R having two properties:

• The function is smooth (infinitely differentiable);

• The function has compact support (is identically zero outside some interval).

Then, a distribution $d$ is a mapping from D(R) to R. Instead of writing $d(\varphi)$, where $\varphi$ is a test function in D(R), it
is conventional to write $\langle d, \varphi \rangle$. A simple example of a distribution is the Dirac delta $\delta$, defined by

$$\delta(\varphi) = \langle \delta, \varphi \rangle = \varphi(0).$$

There are straightforward mappings from both locally integrable functions and probability distributions to 
corresponding distributions, as discussed below. However, not all distributions can be formed in this manner. 

Suppose that

$$f : \mathbb{R} \to \mathbb{R}$$

is a locally integrable function, and let

$$\varphi : \mathbb{R} \to \mathbb{R}$$

be a test function in D(R). We can then define a corresponding distribution $T_f$ by:

$$\langle T_f, \varphi \rangle = \int_{\mathbb{R}} f \varphi \, dx.$$

This integral is a real number which linearly and continuously depends on $\varphi$. This suggests the requirement that a
distribution should be linear and continuous over the space of test functions D(R), which completes the definition. In
a conventional abuse of notation, $f$ may be used to represent both the original function $f$ and the distribution $T_f$
derived from it.






Similarly, if $P$ is a probability distribution on the reals and $\varphi$ is a test function, then a corresponding distribution $T_P$
may be defined by:

$$\langle T_P, \varphi \rangle = \int_{\mathbb{R}} \varphi \, dP.$$

Again, this integral continuously and linearly depends on $\varphi$, so that $T_P$ is in fact a distribution.

Such distributions may be multiplied with real numbers and can be added together, so they form a real vector space. 
In general it is not possible to define a multiplication for distributions, but distributions may be multiplied with 
infinitely differentiable functions. 

It is desirable to choose a definition for the derivative of a distribution which, at least for distributions derived from
sufficiently regular (say, continuously differentiable) functions $f$, has the property that $(T_f)' = T_{f'}$. If $\varphi$ is a test
function, we can show that

$$\langle T_{f'}, \varphi \rangle = \int_{\mathbb{R}} f' \varphi \, dx = -\int_{\mathbb{R}} f \varphi' \, dx = -\langle T_f, \varphi' \rangle$$

using integration by parts and noting that $[f(x)\varphi(x)]_{-\infty}^{\infty} = 0$, since $\varphi$ is zero outside of a bounded set. This
suggests that if $S$ is a distribution, we should define its derivative $S'$ by

$$\langle S', \varphi \rangle = -\langle S, \varphi' \rangle.$$

It turns out that this is the proper definition; it extends the ordinary definition of derivative, every distribution
becomes infinitely differentiable and the usual properties of derivatives hold.
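
The defining identity can be tested numerically on a function that is not classically differentiable everywhere. The sketch below is illustrative and not from the source: it takes f(x) = |x|, whose derivative away from 0 is sign(x), uses a shifted copy of the standard bump function as the test function, and approximates both pairings with a midpoint rule. The helper names and the grid parameters are assumptions chosen for the example.

```python
import math


def bump(x):
    """Standard bump function, smooth and supported on [-1, 1]."""
    return math.exp(-1.0 / (1.0 - x * x)) if abs(x) < 1.0 else 0.0


def bump_prime(x):
    """Derivative of bump, obtained from the chain rule."""
    if abs(x) >= 1.0:
        return 0.0
    u = 1.0 - x * x
    return bump(x) * (-2.0 * x) / (u * u)


def phi(x):
    """Test function: bump shifted so that it is not symmetric about 0."""
    return bump(x - 0.25)


def phi_prime(x):
    return bump_prime(x - 0.25)


def integrate(g, a, b, n):
    """Composite midpoint rule on [a, b] with n cells."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))


def sign(x):
    return (x > 0) - (x < 0)


a, b, n = -0.75, 1.25, 8000   # [a, b] is the support of phi; 0 falls on a cell boundary
minus_pairing_with_phi_prime = -integrate(lambda x: abs(x) * phi_prime(x), a, b, n)
pairing_with_derivative = integrate(lambda x: sign(x) * phi(x), a, b, n)
print(minus_pairing_with_phi_prime, pairing_with_derivative)  # the two values agree closely
```

Both numbers approximate the same pairing, so the distributional derivative of |x| is the distribution given by sign(x), even though |x| has no classical derivative at 0.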

Example: Recall that the Dirac delta (so-called Dirac delta function) is the distribution defined by

$$\langle \delta, \varphi \rangle = \varphi(0).$$

It is the derivative of the distribution corresponding to the Heaviside step function $H$: for any test function $\varphi$,

$$\langle H', \varphi \rangle = -\langle H, \varphi' \rangle = -\int_{-\infty}^{\infty} H(x) \varphi'(x) \, dx = -\int_0^{\infty} \varphi'(x) \, dx = \varphi(0) - \varphi(\infty) = \varphi(0) = \langle \delta, \varphi \rangle,$$

so $(T_H)' = \delta$. Note that $\varphi(\infty) = 0$ because of compact support. Similarly, the derivative of the Dirac delta is the
distribution

$$\langle \delta', \varphi \rangle = -\varphi'(0).$$

This latter distribution is our first example of a distribution which is derived from neither a function nor a probability
distribution.
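
The computation for the Heaviside function can be reproduced numerically. This is a minimal sketch under the assumption that the test function is the standard bump function supported on [−1, 1], so that the integral over (0, ∞) reduces to an integral over (0, 1); the helper names are illustrative.

```python
import math


def bump(x):
    """Standard bump function, smooth and supported on [-1, 1]."""
    return math.exp(-1.0 / (1.0 - x * x)) if abs(x) < 1.0 else 0.0


def bump_prime(x):
    """Derivative of bump, obtained from the chain rule."""
    if abs(x) >= 1.0:
        return 0.0
    u = 1.0 - x * x
    return bump(x) * (-2.0 * x) / (u * u)


def integrate(g, a, b, n):
    """Composite midpoint rule on [a, b] with n cells."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))


# <H', phi> = -<H, phi'> = -(integral of phi' over (0, 1)), since phi vanishes beyond 1
pairing = -integrate(bump_prime, 0.0, 1.0, 20000)
print(pairing, bump(0.0))   # both are approximately exp(-1)
```

The numerical value of the pairing matches the value of the test function at 0, as the identity $(T_H)' = \delta$ predicts.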

Test functions and distributions 

In the sequel, real-valued distributions on an open subset $U$ of $\mathbb{R}^n$ will be formally defined. With minor
modifications, one can also define complex-valued distributions, and one can replace $\mathbb{R}^n$ by any (paracompact)
smooth manifold.

The first object to define is the space D(U) of test functions on $U$. Once this is defined, it is then necessary to equip it
with a topology by defining the limit of a sequence of elements of D(U). The space of distributions will then be
given as the space of continuous linear functionals on D(U).

Test function space 

The space D(U) of test functions on $U$ is defined as follows. A function $\varphi : U \to \mathbb{R}$ is said to have compact support
if there exists a compact subset $K$ of $U$ such that $\varphi(x) = 0$ for all $x$ in $U \setminus K$. The elements of D(U) are the infinitely
differentiable functions $\varphi : U \to \mathbb{R}$ with compact support, also known as bump functions. This is a real vector
space. It can be given a topology by defining the limit of a sequence of elements of D(U). A sequence $(\varphi_k)$ in
D(U) is said to converge to $\varphi \in$ D(U) if the following two conditions hold (Gelfand & Shilov 1966-1968, v. 1,
§1.2):






• There is a compact set $K \subset U$ containing the supports of all $\varphi_k$:

$$\bigcup_k \operatorname{supp}(\varphi_k) \subset K.$$

• For each multi-index $\alpha$, the sequence of partial derivatives $D^\alpha \varphi_k$ tends uniformly to $D^\alpha \varphi$.

With this definition, D(U) becomes a complete locally convex topological vector space satisfying the Heine-Borel
property (Rudin 1991, §6.4-5). If $U_i$ is a countable nested family of open subsets of $U$ whose union is $U$, with compact
closures $K_i = \overline{U_i}$, then

$$D(U) = \bigcup_i D_{K_i}$$

where $D_{K_i}$ is the set of all smooth functions with support lying in $K_i$. The topology on D(U) is the final topology of
the family of nested metric spaces $D_{K_i}$ and so D(U) is an LF-space. The topology is not metrizable by the Baire
category theorem, since D(U) is the union of subspaces of the first category in D(U) (Rudin 1991, §6.9).

Distributions 

A distribution on $U$ is a linear functional $S$ : D(U) → R (or $S$ : D(U) → C), such that

$$\lim_{n \to \infty} S(\varphi_n) = S\left( \lim_{n \to \infty} \varphi_n \right)$$

for any convergent sequence $\varphi_n$ in D(U). The space of all distributions on $U$ is denoted by D'(U). Equivalently, the
vector space D'(U) is the continuous dual space of the topological vector space D(U).

The dual pairing between a distribution $S$ in D'(U) and a test function $\varphi$ in D(U) is denoted using angle brackets
thus:

$$D'(U) \times D(U) \ni (S, \varphi) \mapsto \langle S, \varphi \rangle \in \mathbb{R}.$$

Equipped with the weak-* topology, the space D'(U) is a locally convex topological vector space. In particular, a
sequence $(S_k)$ in D'(U) converges to a distribution $S$ if and only if

$$\langle S_k, \varphi \rangle \to \langle S, \varphi \rangle$$

for all test functions $\varphi$. This is the case if and only if $S_k$ converges uniformly to $S$ on all bounded subsets of D(U).
(A subset $E$ of D(U) is bounded if there exists a compact subset $K$ of $U$ and numbers $d_n$ such that every $\varphi$ in $E$ has
its support in $K$ and has its $n$-th derivatives bounded by $d_n$.)

Functions as distributions 

The function $f : U \to \mathbb{R}$ is called locally integrable if it is Lebesgue integrable over every compact subset $K$ of $U$.
This is a large class of functions which includes all continuous functions and all $L^p$ functions. The topology on D(U)
is defined in such a fashion that any locally integrable function $f$ yields a continuous linear functional on D(U),
that is, an element of D'(U), denoted here by $T_f$, whose value on the test function $\varphi$ is given by the Lebesgue
integral:

$$\langle T_f, \varphi \rangle = \int_U f \varphi \, dx.$$

Conventionally, one abuses notation by identifying $T_f$ with $f$, provided no confusion can arise, and thus the pairing
between $f$ and $\varphi$ is often written

$$\langle f, \varphi \rangle = \langle T_f, \varphi \rangle.$$

If $f$ and $g$ are two locally integrable functions, then the associated distributions $T_f$ and $T_g$ are equal to the same
element of D'(U) if and only if $f$ and $g$ are equal almost everywhere (see, for instance, Hörmander (1983, Theorem
1.2.5)). In a similar manner, every Radon measure $\mu$ on $U$ defines an element of D'(U) whose value on the test
function $\varphi$ is $\int \varphi \, d\mu$. As above, it is conventional to abuse notation and write the pairing between a Radon
measure $\mu$ and a test function $\varphi$ as $\langle \mu, \varphi \rangle$. Conversely, essentially by the Riesz representation theorem, every distribution
which is non-negative on non-negative functions is of this form for some (positive) Radon measure.

The test functions are themselves locally integrable, and so define distributions. As such they are dense in D'(U) with
respect to the topology on D'(U) in the sense that for any distribution $S \in$ D'(U), there is a sequence $\varphi_n \in$ D(U)
such that

$$\langle \varphi_n, \psi \rangle \to \langle S, \psi \rangle$$

for all $\psi \in$ D(U). This follows at once from the Hahn-Banach theorem, since by an elementary fact about weak
topologies the dual of D'(U) with its weak-* topology is the space D(U) (Rudin 1991, Theorem 3.10). This can also
be proven more constructively by a convolution argument.



Operations on distributions 

Many operations which are defined on smooth functions with compact support can also be defined for distributions.
In general, if

$$T : D(U) \to D(U)$$

is a linear mapping of vector spaces which is continuous with respect to the weak-* topology, then it is possible to
extend $T$ to a mapping

$$T : D'(U) \to D'(U)$$

by passing to the limit. (This approach works for more general non-linear mappings as well, provided they are
assumed to be uniformly continuous.)

In practice, however, it is more convenient to define operations on distributions by means of the transpose (or adjoint
transformation) (Strichartz 1994, §2.3; Trèves 1967). If $T$ : D(U) → D(U) is a continuous linear operator, then the
transpose is an operator $T^*$ : D(U) → D(U) such that

$$\langle T\varphi, \psi \rangle = \langle \varphi, T^*\psi \rangle$$

for all $\varphi, \psi \in$ D(U). If such an operator $T^*$ exists, and is continuous, then the original operator $T$ may be extended
to distributions by defining

$$\langle TS, \psi \rangle = \langle S, T^*\psi \rangle.$$

Differentiation 

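Suppose $T$ : D(U) → D(U) is given by the partial derivative

$$T\varphi = \frac{\partial \varphi}{\partial x_k}.$$

By integration by parts, if $\varphi$ and $\psi$ are in D(U), then

$$\left\langle \frac{\partial \varphi}{\partial x_k}, \psi \right\rangle = -\left\langle \varphi, \frac{\partial \psi}{\partial x_k} \right\rangle$$

so that $T^* = -T$. This is a continuous linear transformation D(U) → D(U). So, if $S \in$ D'(U) is a distribution, then the
partial derivative of $S$ with respect to the coordinate $x_k$ is defined by the formula

$$\left\langle \frac{\partial S}{\partial x_k}, \varphi \right\rangle = -\left\langle S, \frac{\partial \varphi}{\partial x_k} \right\rangle$$

for all test functions $\varphi$. In this way, every distribution is infinitely differentiable, and the derivative in the direction
$x_k$ is a linear operator on D'(U). In general, if $\alpha = (\alpha_1, \ldots, \alpha_n)$ is an arbitrary multi-index and $\partial^\alpha$ denotes the
associated mixed partial derivative operator, the mixed partial derivative $\partial^\alpha S$ of the distribution $S \in$ D'(U) is defined
by

$$\langle \partial^\alpha S, \varphi \rangle = (-1)^{|\alpha|} \langle S, \partial^\alpha \varphi \rangle \text{ for all } \varphi \in D(U).$$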






Differentiation of distributions is a continuous operator on D'(U); this is an important and desirable property that is
not shared by most other notions of differentiation.

Multiplication by a smooth function 

If $m : U \to \mathbf{R}$ is an infinitely differentiable function and S is a distribution on U, then the product mS is defined by
$(mS)(\varphi) = S(m\varphi)$ for all test functions $\varphi$. This definition coincides with the transpose transformation of

$$T_m : \varphi \mapsto m\varphi$$

for $\varphi \in D(U)$. Then, for any test function $\psi$,

$$\langle T_m \varphi, \psi \rangle = \int_U m(x) \varphi(x) \psi(x)\, dx = \langle \varphi, T_m \psi \rangle$$

so that $T_m^* = T_m$. Multiplication of a distribution S by the smooth function m is therefore defined by

$$mS(\psi) = \langle mS, \psi \rangle = \langle S, m\psi \rangle = S(m\psi).$$

Under multiplication by smooth functions, $D'(U)$ is a module over the ring $C^\infty(U)$. With this definition of
multiplication by a smooth function, the ordinary product rule of calculus remains valid. However, a number of
unusual identities also arise. For example, the Dirac delta distribution $\delta$ is defined on $\mathbf{R}$ by $\langle \delta, \varphi \rangle = \varphi(0)$, and
its derivative is given by $\langle \delta', \varphi \rangle = -\langle \delta, \varphi' \rangle = -\varphi'(0)$. However, the product $m\delta'$ is the distribution

$$m\delta' = m(0)\delta' - m'\delta.$$
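
The identity for mδ' follows from the product rule applied to mφ, together with m'δ = m'(0)δ. A small numerical sketch (an illustration under stated assumptions, not part of the original text) makes this concrete. Distributions are modelled as callables on test functions, derivatives are approximated by central differences, and the polynomial m and the Gaussian phi are arbitrary sample choices; a Gaussian is not compactly supported, but only its values near 0 are used here.

```python
import math

def derivative(phi, h=1e-5):
    """Central-difference approximation of phi' (numerical assumption)."""
    return lambda x: (phi(x + h) - phi(x - h)) / (2.0 * h)

def delta(phi):
    """Dirac delta: <delta, phi> = phi(0)."""
    return phi(0.0)

def delta_prime(phi):
    """<delta', phi> = -<delta, phi'> = -phi'(0)."""
    return -derivative(phi)(0.0)

def times(m, S):
    """Product of a smooth function m and a distribution S: (mS)(phi) = S(m * phi)."""
    return lambda phi: S(lambda x: m(x) * phi(x))

m = lambda x: 2.0 + 3.0 * x + x * x        # sample coefficient: m(0) = 2, m'(0) = 3
phi = lambda x: math.exp(-(x - 0.5) ** 2)  # sample smooth function

lhs = times(m, delta_prime)(phi)                 # <m delta', phi>
rhs = 2.0 * delta_prime(phi) - 3.0 * delta(phi)  # <m(0) delta' - m'(0) delta, phi>
print(lhs, rhs)
```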

This definition of multiplication also makes it possible to define the operation of a linear differential operator with
smooth coefficients on a distribution. A linear differential operator P takes a distribution $S \in D'(U)$ to another
distribution given by a sum of the form

$$PS = \sum_{|\alpha| \le k} p_\alpha \partial^\alpha S$$

where the coefficients $p_\alpha$ are smooth functions in U. If P is a given differential operator, then the minimum integer k
for which such an expansion holds for every distribution S is called the order of P. The transpose of P is given by

$$\langle PS, \varphi \rangle = \left\langle S, \sum_{|\alpha| \le k} (-1)^{|\alpha|} \partial^\alpha (p_\alpha \varphi) \right\rangle.$$

The space $D'(U)$ is a D-module with respect to the action of the ring of linear differential operators.

Composition with a smooth function 

Let S be a distribution on an open set $U \subset \mathbf{R}^n$. Let V be an open set in $\mathbf{R}^n$, and $F : V \to U$. Then provided F is a
submersion, it is possible to define

$$S \circ F \in D'(V).$$

This is the composition of the distribution S with F, and is also called the pullback of S along F, sometimes written

$$F^\# : S \mapsto F^\# S = S \circ F.$$

The pullback is often denoted $F^*$, but this notation risks confusion with the above use of '*' to denote the transpose of
a linear mapping.

The condition that F be a submersion is equivalent to the requirement that the Jacobian derivative dF(x) of F is a
surjective linear map for every $x \in V$. A necessary (but not sufficient) condition for extending $F^\#$ to distributions is
that F be an open mapping (Hörmander 1983, Theorem 6.1.1). The inverse function theorem ensures that a
submersion satisfies this condition.

If F is a submersion, then $F^\#$ is defined on distributions by finding the transpose map. Uniqueness of this extension is
guaranteed since $F^\#$ is a continuous linear operator on D(U). Existence, however, requires using the change of
variables formula, the inverse function theorem (locally) and a partition of unity argument; see Hörmander (1983,
Theorem 6.1.2).

In the special case when F is a diffeomorphism from an open subset V of $\mathbf{R}^n$ onto an open subset U of $\mathbf{R}^n$, change of
variables under the integral gives

$$\int_V \varphi \circ F(x)\, \psi(x)\, dx = \int_U \varphi(x)\, \psi(F^{-1}(x))\, |\det dF^{-1}(x)|\, dx.$$

In this particular case, then, $F^\#$ is defined by the transpose formula:

$$\langle F^\# S, \varphi \rangle = \langle S, |\det d(F^{-1})|\, \varphi \circ F^{-1} \rangle.$$
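
As a one-dimensional illustration of the transpose formula, consider the affine diffeomorphism F(x) = ax + b of the real line, for which F^{-1}(y) = (y - b)/a and |det dF^{-1}| = 1/|a|. The sketch below is an assumption-laden example rather than part of the original text: distributions are modelled as callables, integrals are approximated by a midpoint rule on a finite window, and Gaussians stand in for test functions. It checks that the transpose formula reproduces ordinary composition for a regular distribution, and that δ∘F evaluates a test function at -b/a with weight 1/|a|.

```python
import math

def integrate(g, a=-12.0, b=12.0, n=4000):
    """Midpoint rule on [a, b]; adequate here because the integrands decay rapidly."""
    step = (b - a) / n
    return step * sum(g(a + (i + 0.5) * step) for i in range(n))

def regular(s):
    """Distribution defined by integrating against the function s."""
    return lambda phi: integrate(lambda x: s(x) * phi(x))

def pullback_affine(S, a, b):
    """Pullback of S along F(x) = a*x + b via <F#S, phi> = <S, |dF^{-1}| phi o F^{-1}>."""
    if a == 0:
        raise ValueError("F must be a diffeomorphism, so a must be nonzero")
    return lambda phi: S(lambda y: phi((y - b) / a) / abs(a))

s = lambda y: math.exp(-y * y)              # sample density
phi = lambda x: math.exp(-(x - 1.0) ** 2)   # sample smooth function
a, b = -2.0, 1.0

via_transpose = pullback_affine(regular(s), a, b)(phi)
direct = regular(lambda x: s(a * x + b))(phi)   # pairing with the composed function s(F(x))

delta = lambda f: f(0.0)
delta_pulled = pullback_affine(delta, a, b)(phi)  # equals phi(-b/a) / |a|
print(via_transpose, direct, delta_pulled)
```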

Localization of distributions 

There is no way to define the value of a distribution in $D'(U)$ at a particular point of U. However, as is the case with
functions, distributions on U restrict to give distributions on open subsets of U. Furthermore, distributions are locally 
determined in the sense that a distribution on all of U can be assembled from a distribution on an open cover of U 
satisfying some compatibility conditions on the overlap. Such a structure is known as a sheaf. 

Restriction 

Let U and V be open subsets of $\mathbf{R}^n$ with $V \subset U$. Let $E_{VU} : D(V) \to D(U)$ be the operator which extends by zero a
given smooth function compactly supported in V to a smooth function compactly supported in the larger set U. Then
the restriction mapping $\rho_{VU}$ is defined to be the transpose of $E_{VU}$. Thus for any distribution $S \in D'(U)$, the restriction
$\rho_{VU} S$ is a distribution in the dual space $D'(V)$ defined by

$$\langle \rho_{VU} S, \varphi \rangle = \langle S, E_{VU} \varphi \rangle$$

for all test functions $\varphi \in D(V)$.

Unless U = V, the restriction to V is neither injective nor surjective. Lack of surjectivity follows since distributions
can blow up towards the boundary of V. For instance, if U = R and V = (0,2), then the distribution

$$S(x) = \sum_{n=1}^\infty n\, \delta\!\left(x - \frac{1}{n}\right)$$

is in $D'(V)$ but admits no extension to $D'(U)$.

Support of a distribution

Let $S \in D'(U)$ be a distribution on an open set U. Then S is said to vanish on an open set V of U if S lies in the kernel
of the restriction map $\rho_{VU}$. Explicitly S vanishes on V if

$$\langle S, \varphi \rangle = 0$$

for all test functions $\varphi \in C^\infty(U)$ with support in V. Let V be a maximal open set on which the distribution S
vanishes; i.e., V is the union of every open set on which S vanishes. The support of S is the complement of V in U.
Thus

$$\operatorname{supp} S = U \setminus \bigcup \{ V \mid \rho_{VU} S = 0 \}.$$

The distribution S has compact support if its support is a compact set. Explicitly, S has compact support if there is a
compact subset K of U such that for every test function $\varphi$ whose support is completely outside of K, we have
$S(\varphi) = 0$. Compactly supported distributions define continuous linear functionals on the space $C^\infty(U)$; the topology on
$C^\infty(U)$ is defined such that a sequence of test functions $\varphi_k$ converges to 0 if and only if all derivatives of $\varphi_k$
converge uniformly to 0 on every compact subset of U. Conversely, it can be shown that every continuous linear
functional on this space defines a distribution of compact support.






Tempered distributions and Fourier transform 

By using a larger space of test functions, one can define the tempered distributions, a subspace of $D'(\mathbf{R}^n)$. These
distributions are useful if one studies the Fourier transform in generality: all tempered distributions have a Fourier 
transform, but not all distributions have one. 

The space of test functions employed here, the so-called Schwartz space $S(\mathbf{R}^n)$, is the function space of all infinitely
differentiable functions that are rapidly decreasing at infinity along with all partial derivatives. Thus $\varphi : \mathbf{R}^n \to \mathbf{R}$ is
in the Schwartz space provided that any derivative of $\varphi$, multiplied with any power of $|x|$, converges towards 0 for
$|x| \to \infty$. These functions form a complete topological vector space with a suitably defined family of seminorms.
More precisely, let

$$p_{\alpha,\beta}(\varphi) = \sup_{x \in \mathbf{R}^n} |x^\alpha D^\beta \varphi(x)|$$

for $\alpha$, $\beta$ multi-indices of size n. Then $\varphi$ is a Schwartz function if all the values

$$p_{\alpha,\beta}(\varphi) < \infty.$$

The family of seminorms $p_{\alpha,\beta}$ defines a locally convex topology on the Schwartz space. The seminorms are, in fact,
norms on the Schwartz space, since Schwartz functions are smooth. The Schwartz space is metrizable and complete.

The space of tempered distributions is defined as the (continuous) dual of the Schwartz space. In other words, a
distribution F is a tempered distribution if and only if

$$\lim_{m \to \infty} F(\varphi_m) = 0$$

is true whenever

$$\lim_{m \to \infty} p_{\alpha,\beta}(\varphi_m) = 0$$

holds for all multi-indices $\alpha$, $\beta$.

The derivative of a tempered distribution is again a tempered distribution. Tempered distributions generalize the
bounded (or slow-growing) locally integrable functions; all distributions with compact support and all
square-integrable functions are tempered distributions. All locally integrable functions f with at most polynomial
growth, i.e. such that $f(x) = O(|x|^r)$ for some r, are tempered distributions. This includes all functions in $L^p(\mathbf{R}^n)$ for
$p \ge 1$.

The tempered distributions can also be characterized as slowly growing. This characterization is dual to the rapidly
falling behaviour, e.g. $\propto |x|^n \cdot \exp(-x^2)$, of the test functions.

To study the Fourier transform, it is best to consider complex-valued test functions and complex-linear distributions.
The ordinary continuous Fourier transform F yields then an automorphism of Schwartz function space, and we can
define the Fourier transform of the tempered distribution S by $(FS)(\psi) = S(F\psi)$ for every test function $\psi$. FS is
thus again a tempered distribution. The Fourier transform is a continuous, linear, bijective operator from the space of
tempered distributions to itself. This operation is compatible with differentiation in the sense that

$$F\frac{dS}{dx} = ix\, FS$$

and also with convolution: if S is a tempered distribution and $\psi$ is a slowly increasing infinitely differentiable
function on $\mathbf{R}^n$ (meaning that all derivatives of $\psi$ grow at most as fast as polynomials), then $S\psi$ is again a tempered
distribution and

$$F(S\psi) = FS * F\psi$$

is the convolution of FS and $F\psi$.
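
The compatibility with differentiation can be checked numerically for an ordinary Schwartz function, which is a special case of a tempered distribution. The sketch below is illustrative only and rests on one assumption that the text does not fix: the convention (Ff)(ξ) = ∫ f(x) exp(-ixξ) dx without normalising constant, which is the sign convention under which the factor is +iξ. The integral is approximated by a midpoint rule on a finite window.

```python
import cmath
import math

def fourier(f, xi, half_width=12.0, n=4000):
    """Midpoint-rule approximation of (Ff)(xi) = integral of f(x) * exp(-i x xi) dx."""
    step = 2.0 * half_width / n
    total = 0.0
    for k in range(n):
        x = -half_width + (k + 0.5) * step
        total += f(x) * cmath.exp(-1j * x * xi)
    return step * total

f = lambda x: math.exp(-x * x / 2.0)             # Gaussian, a Schwartz function
f_prime = lambda x: -x * math.exp(-x * x / 2.0)  # its exact derivative

xi = 1.3
lhs = fourier(f_prime, xi)         # F(f')(xi)
rhs = 1j * xi * fourier(f, xi)     # i * xi * (Ff)(xi)
print(lhs, rhs)
```

Under the stated convention the Gaussian also satisfies (Ff)(0) = sqrt(2π), which gives a quick sanity check on the quadrature.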






Convolution 

Under some circumstances, it is possible to define the convolution of a function with a distribution, or even the 
convolution of two distributions. 

Convolution of a test function with a distribution 

If $f \in D(\mathbf{R}^n)$ is a compactly supported smooth test function, then convolution with f defines an operator

$$C_f : D(\mathbf{R}^n) \to D(\mathbf{R}^n)$$

defined by $C_f g = f * g$, which is linear (and continuous with respect to the LF space topology on $D(\mathbf{R}^n)$).

Convolution of f with a distribution $S \in D'(\mathbf{R}^n)$ can be defined by taking the transpose of $C_f$ relative to the duality
pairing of $D(\mathbf{R}^n)$ with the space $D'(\mathbf{R}^n)$ of distributions (Trèves 1967, Chapter 27). If $f, g, \varphi \in D(\mathbf{R}^n)$, then by
Fubini's theorem

$$\langle C_f g, \varphi \rangle = \int \varphi(x) \int f(x - y) g(y)\, dy\, dx = \langle g, C_{\tilde f} \varphi \rangle$$

where $\tilde f(x) = f(-x)$. Extending by continuity, the convolution of f with a distribution S is defined by

$$\langle f * S, \varphi \rangle = \langle S, \tilde f * \varphi \rangle$$

for all test functions $\varphi \in D(\mathbf{R}^n)$.

An alternative way to define the convolution of a function f and a distribution S is to use the translation operator
$\tau_x$ defined on test functions by

$$\tau_x \varphi(y) = \varphi(y - x)$$

and extended by the transpose to distributions in the obvious way (Rudin 1991, §6.29). The convolution of the
compactly supported function f and the distribution S is then the function defined for each $x \in \mathbf{R}^n$ by

$$(f * S)(x) = \langle S, \tau_x \tilde f \rangle.$$

It can be shown that the convolution of a compactly supported function and a distribution is a smooth function. If the
distribution S has compact support as well, then f*S is a compactly supported function, and the Titchmarsh
convolution theorem (Hörmander 1983, Theorem 4.3.3) implies that

$$\operatorname{ch}(f * S) = \operatorname{ch} f + \operatorname{ch} S$$

where ch denotes the convex hull of the support.
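
The translation-operator definition is easy to exercise on point-supported distributions. The following sketch is an illustration rather than part of the original text: distributions are modelled as callables, δ' is approximated by a central difference, and a Gaussian is used in place of a compactly supported f for convenience. It shows that convolving with δ returns f, convolving with a shifted delta shifts f, and convolving with δ' differentiates f.

```python
import math

def reflect(f):
    """f~(y) = f(-y)."""
    return lambda y: f(-y)

def translate(x, phi):
    """(tau_x phi)(y) = phi(y - x)."""
    return lambda y: phi(y - x)

def convolve(f, S):
    """(f * S)(x) = <S, tau_x f~>, so the test function is y -> f(x - y)."""
    return lambda x: S(translate(x, reflect(f)))

def delta_at(a):
    """Point mass at a: phi -> phi(a)."""
    return lambda phi: phi(a)

def delta_prime(phi, h=1e-5):
    """<delta', phi> = -phi'(0), with phi'(0) approximated by a central difference."""
    return -(phi(h) - phi(-h)) / (2.0 * h)

f = lambda x: math.exp(-x * x)                    # sample smooth function
f_prime = lambda x: -2.0 * x * math.exp(-x * x)   # its exact derivative
x0 = 0.7

print(convolve(f, delta_at(0.0))(x0), f(x0))        # f * delta = f
print(convolve(f, delta_at(2.0))(x0), f(x0 - 2.0))  # f * delta_a = f(. - a)
print(convolve(f, delta_prime)(x0), f_prime(x0))    # f * delta' = f'
```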

Distribution of compact support

It is also possible to define the convolution of two distributions S and T on $\mathbf{R}^n$, provided one of them has compact
support. Informally, in order to define S*T where T has compact support, the idea is to extend the definition of the
convolution * to a linear operation on distributions so that the associativity formula

$$S * (T * \varphi) = (S * T) * \varphi$$

continues to hold for all test functions $\varphi$. Hörmander (1983, §IV.2) proves the uniqueness of such an extension.

It is also possible to provide a more explicit characterization of the convolution of distributions (Trèves 1967,
Chapter 27). Suppose that it is T that has compact support. For any test function $\varphi$ in $D(\mathbf{R}^n)$, consider the function

$$\psi(x) = \langle T, \tau_{-x} \varphi \rangle.$$

It can be readily shown that this defines a smooth function of x, which moreover has compact support. The
convolution of S and T is defined by

$$\langle S * T, \varphi \rangle = \langle S, \psi \rangle.$$

This generalizes the classical notion of convolution of functions and is compatible with differentiation in the
following sense:

$$\partial^\alpha (S * T) = (\partial^\alpha S) * T = S * (\partial^\alpha T).$$






This definition of convolution remains valid under less restrictive assumptions about S and T; see for instance
Gel'fand & Shilov (1966-1968, v. 1, pp. 103-104) and Benedetto (1997, Definition 2.5.8).

Distributions as derivatives of continuous functions 

The formal definition of distributions exhibits them as a subspace of a very large space, namely the algebraic dual of
D(U) (or $S(\mathbf{R}^n)$ for tempered distributions). It is not immediately clear from the definition how exotic a distribution
might be. To answer this question, it is instructive to see distributions built up from a smaller space, namely the 
space of continuous functions. Roughly, any distribution is locally a (multiple) derivative of a continuous function. A 
precise version of this result, given below, holds for distributions of compact support, tempered distributions, and 
general distributions. Generally speaking, no proper subset of the space of distributions contains all continuous 
functions and is closed under differentiation. This says that distributions are not particularly exotic objects; they are 
only as complicated as necessary. 

Tempered distributions 

If $f \in S'(\mathbf{R}^n)$ is a tempered distribution, then there exists a constant C > 0, and positive integers M and N such that for
all Schwartz functions $\varphi \in S(\mathbf{R}^n)$

$$\langle f, \varphi \rangle \le C \sum_{|\alpha| \le N,\, |\beta| \le M} \sup_{x \in \mathbf{R}^n} |x^\alpha D^\beta \varphi(x)| = C \sum_{|\alpha| \le N,\, |\beta| \le M} p_{\alpha,\beta}(\varphi).$$

This estimate along with some techniques from functional analysis can be used to show that there is a continuous
slowly increasing function F and a multi-index $\alpha$ such that

$$f = D^\alpha F.$$

Compactly supported distributions 

Let U be an open set, and K a compact subset of U. If f is a distribution supported on K, then there is a continuous
function F compactly supported in U (possibly on a larger set than K itself) such that

$$f = D^\alpha F$$

for some multi-index $\alpha$. This follows from the previously quoted result on tempered distributions by means of a
localization argument. 

Distributions with point support 

If f has support at a single point {P}, then f is in fact a finite linear combination of distributional derivatives of the $\delta$
function at P. That is, there exists an integer m and complex constants $a_\alpha$ for multi-indices $|\alpha| \le m$ such that

$$f = \sum_{|\alpha| \le m} a_\alpha D^\alpha (\tau_P \delta)$$

where $\tau_P$ is the translation operator.

General distributions

A version of the above theorem holds locally in the following sense (Rudin 1991). Let S be a distribution on U. Then
one can find for every multi-index $\alpha$ a continuous function $g_\alpha$ such that

$$S = \sum_\alpha D^\alpha g_\alpha$$

and that any compact subset K of U intersects the supports of only finitely many $g_\alpha$; therefore, to evaluate the value
of S for a given smooth function f compactly supported in U, we only need finitely many $g_\alpha$; hence the infinite sum
above is well-defined as a distribution. If the distribution S is of finite order, then one can choose $g_\alpha$ in such a way
that only finitely many of them are nonzero.






Using holomorphic functions as test functions 

The success of the theory led to investigation of the idea of hyperfunction, in which spaces of holomorphic functions
are used as test functions. A refined theory has been developed, in particular Mikio Sato's algebraic analysis, using 
sheaf theory and several complex variables. This extends the range of symbolic methods that can be made into 
rigorous mathematics, for example Feynman integrals. 



Problem of multiplication 

A possible limitation of the theory of distributions (and hyperfunctions) is that it is a purely linear theory, in the
sense that the product of two distributions cannot consistently be defined (in general), as was proved by Laurent
Schwartz in the 1950s. For example, if p.v. 1/x is the distribution obtained by the Cauchy principal value

$$\left(\operatorname{p.v.} \frac{1}{x}\right)[\varphi] = \lim_{\varepsilon \to 0^+} \int_{|x| \ge \varepsilon} \frac{\varphi(x)}{x}\, dx$$

for all $\varphi \in S(\mathbf{R})$, and $\delta$ is the Dirac delta distribution then

$$(\delta \times x) \times \operatorname{p.v.} \frac{1}{x} = 0$$

but

$$\delta \times \left(x \times \operatorname{p.v.} \frac{1}{x}\right) = \delta$$

so the product of a distribution by a smooth function (which is always well defined) cannot be extended to an
associative product on the space of distributions.
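
The two ingredients of this counterexample are the identities xδ = 0 and x · p.v.(1/x) = 1: the first makes the left-hand product vanish, the second reduces the right-hand product to δ × 1 = δ. The sketch below checks both numerically. It is an illustration under stated assumptions, not part of the original text: distributions are callables, the principal value is approximated with a small symmetric cut-off eps and a finite integration window, and a Gaussian serves as the Schwartz test function.

```python
import math

def times(m, S):
    """Product of a smooth function m and a distribution S: (mS)(phi) = S(m * phi)."""
    return lambda phi: S(lambda x: m(x) * phi(x))

def delta(phi):
    """Dirac delta: phi -> phi(0)."""
    return phi(0.0)

def pv_one_over_x(phi, eps=1e-8, cutoff=10.0, n=20000):
    """Approximate p.v. integral of phi(x)/x over eps < |x| < cutoff.

    The two half-lines are folded together: the integrand on (eps, cutoff)
    is (phi(x) - phi(-x)) / x, evaluated with a midpoint rule.
    """
    step = (cutoff - eps) / n
    total = 0.0
    for i in range(n):
        x = eps + (i + 0.5) * step
        total += (phi(x) - phi(-x)) / x
    return step * total

identity = lambda x: x
phi = lambda x: math.exp(-(x - 0.3) ** 2)   # sample Schwartz function

x_delta = times(identity, delta)(phi)        # <x delta, phi> = 0 * phi(0)
x_pv = times(identity, pv_one_over_x)(phi)   # <x p.v.(1/x), phi> = integral of phi
print(x_delta, x_pv, math.sqrt(math.pi))
```

The second value approximates the integral of phi over the real line, which for this Gaussian is sqrt(π); that is, x · p.v.(1/x) acts as the constant function 1.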

Thus, nonlinear problems cannot in general be posed, and hence cannot be solved, within distribution theory alone. In the
context of quantum field theory, however, solutions can be found. In more than two spacetime dimensions the
problem is related to the regularization of divergences. Here Henri Epstein and Vladimir Glaser developed the
mathematically rigorous (but extremely technical) causal perturbation theory. This does not solve the problem in
other situations. Many other interesting theories are nonlinear, for example the Navier-Stokes equations of fluid
dynamics.

In view of this, several not entirely satisfactory theories of algebras of generalized functions have been developed,
among which Colombeau's (simplified) algebra is perhaps the most widely used today.

A simple solution of the multiplication problem is dictated by the path integral formulation of quantum mechanics. 
Since this is required to be equivalent to the Schrodinger theory of quantum mechanics which is invariant under 
coordinate transformations, this property must be shared by path integrals. This fixes all products of distributions as 
shown by Kleinert & Chervyakov (2001). The result is equivalent to what can be derived from dimensional
regularization (Kleinert & Chervyakov 2000).



References 

• Benedetto, J.J. (1997), Harmonic Analysis and Applications, CRC Press. 

• Gel'fand, I.M.; Shilov, G.E. (1966-1968), Generalized functions, 1-5, Academic Press. 

• Hörmander, L. (1983), The analysis of linear partial differential operators I, Grundl. Math. Wissenschaft., 256,
Springer, MR0717035, ISBN 3-540-12104-8.

• Kleinert, H.; Chervyakov, A. (2001), "Rules for integrals over products of distributions from coordinate
independence of path integrals" [1], Europ. Phys. J. C 19: 743-747, doi:10.1007/s100520100600.

• Kleinert, H.; Chervyakov, A. (2000), "Coordinate Independence of Quantum-Mechanical Path Integrals" [2],
Phys. Lett. A 269: 63, doi:10.1016/S0375-9601(00)00475-8.

• Rudin, W. (1991), Functional Analysis (2nd ed.), McGraw-Hill, ISBN 0-07-054236-8. 






• Schwartz, L. (1954), "Sur l'impossibilité de la multiplication des distributions", C. R. Acad. Sci. Paris 239:
847-848.

• Schwartz, L. (1950-1951), Théorie des distributions, 1-2, Hermann.

• Stein, Elias; Weiss, Guido (1971), Introduction to Fourier Analysis on Euclidean Spaces, Princeton University 
Press, ISBN 0-691-08078-X. 

• Strichartz, R. (1994), A Guide to Distribution Theory and Fourier Transforms, CRC Press, ISBN 0849382734. 

• Trèves, François (1967), Topological Vector Spaces, Distributions and Kernels, Academic Press, pp. 126 ff.

Further reading 

• M. J. Lighthill (1959). Introduction to Fourier Analysis and Generalised Functions. Cambridge University Press. 
ISBN 0-521-09128-4 (requires very little knowledge of analysis; defines distributions as limits of sequences of 
functions under integrals) 

• H. Kleinert, Path Integrals in Quantum Mechanics, Statistics, Polymer Physics, and Financial Markets, 4th
edition, World Scientific (Singapore, 2006) [3] (also available online here [4]). See Chapter 11 for defining
products of distributions from the physical requirement of coordinate invariance.

• Vladimirov, V.S. (2001), "Generalized function" [5], in Hazewinkel, Michiel, Encyclopaedia of Mathematics,
Springer, ISBN 978-1556080104.

• Vladimirov, V.S. (2001), "Generalized functions, space of" [6], in Hazewinkel, Michiel, Encyclopaedia of
Mathematics, Springer, ISBN 978-1556080104.

• Vladimirov, V.S. (2001), "Generalized function, derivative of a" [7], in Hazewinkel, Michiel, Encyclopaedia of
Mathematics, Springer, ISBN 978-1556080104.

• Vladimirov, V.S. (2001), "Generalized functions, product of" [8], in Hazewinkel, Michiel, Encyclopaedia of
Mathematics, Springer, ISBN 978-1556080104.

• Oberguggenberger, Michael (2001), "Generalized function algebras" [9], in Hazewinkel, Michiel, Encyclopaedia
of Mathematics, Springer, ISBN 978-1556080104.

References 

[1] http://www.physik.fu-berlin.de/~kleinert/kleiner_re303/wardepl.pdf

[2] http://www.physik.fu-berlin.de/~kleinert/305/klch2.pdf

[3] http://www.worldscibooks.com/physics/6223.html

[4] http://www.physik.fu-berlin.de/~kleinert/b5

[5] http://eom.springer.de/G/g043810.htm

[6] http://eom.springer.de/G/g043840.htm

[7] http://eom.springer.de/G/g043820.htm

[8] http://eom.springer.de/G/g043830.htm

[9] http://eom.springer.de/G/g130030.htm






Hilbert space 




[Figure: Hilbert spaces can be used to study the harmonics of vibrating strings.]



The mathematical concept of a Hilbert space, named after David 
Hilbert, generalizes the notion of Euclidean space. It extends the 
methods of vector algebra and calculus from the two-dimensional 
Euclidean plane and three-dimensional space to spaces with any finite 
or infinite number of dimensions. A Hilbert space is an abstract vector 
space possessing the structure of an inner product that allows length 
and angle to be measured. Furthermore, Hilbert spaces are required to 
be complete, a property that stipulates the existence of enough limits in 
the space to allow the techniques of calculus to be used. 

Hilbert spaces arise naturally and frequently in mathematics, physics, 
and engineering, typically as infinite-dimensional function spaces. The 
earliest Hilbert spaces were studied from this point of view in the first 
decade of the 20th century by David Hilbert, Erhard Schmidt, and 
Frigyes Riesz. They are indispensable tools in the theories of partial differential equations, quantum mechanics, 
Fourier analysis (which includes applications to signal processing and heat transfer) and ergodic theory which forms 
the mathematical underpinning of the study of thermodynamics. John von Neumann coined the term "Hilbert space" 
for the abstract concept underlying many of these diverse applications. The success of Hilbert space methods ushered 
in a very fruitful era for functional analysis. Apart from the classical Euclidean spaces, examples of Hilbert spaces 
include spaces of square-integrable functions, spaces of sequences, Sobolev spaces consisting of generalized 
functions, and Hardy spaces of holomorphic functions. 

Geometric intuition plays an important role in many aspects of Hilbert space theory. Exact analogs of the 
Pythagorean theorem and parallelogram law hold in a Hilbert space. At a deeper level, perpendicular projection onto 
a subspace (the analog of "dropping the altitude" of a triangle) plays a significant role in optimization problems and 
other aspects of the theory. An element of a Hilbert space can be uniquely specified by its coordinates with respect to 
a set of coordinate axes (an orthonormal basis), in analogy with Cartesian coordinates in the plane. When that set of 
axes is countably infinite, this means that the Hilbert space can also usefully be thought of in terms of infinite 
sequences that are square-summable. Linear operators on a Hilbert space are likewise fairly concrete objects: in good
cases, they are simply transformations that stretch the space by different factors in mutually perpendicular directions 
in a sense that is made precise by the study of their spectral theory. 



Definition and illustration 

Motivating example: Euclidean space 

One of the most familiar examples of a Hilbert space is the Euclidean space consisting of three-dimensional vectors,
denoted by $\mathbf{R}^3$, and equipped with the dot product. The dot product takes two vectors x and y, and produces a real
number $x \cdot y$. If x and y are represented in Cartesian coordinates, then the dot product is defined by

$$(x_1, x_2, x_3) \cdot (y_1, y_2, y_3) = x_1 y_1 + x_2 y_2 + x_3 y_3.$$

The dot product satisfies the properties:

1. It is symmetric in x and y: $x \cdot y = y \cdot x$.

2. It is linear in its first argument: $(a x_1 + b x_2) \cdot y = a\, x_1 \cdot y + b\, x_2 \cdot y$ for any scalars a, b, and vectors $x_1$, $x_2$, and y.

3. It is positive definite: for all vectors x, $x \cdot x \ge 0$ with equality if and only if x = 0.






An operation on pairs of vectors that, like the dot product, satisfies these three properties is known as a (real) inner
product. A vector space equipped with such an inner product is known as a (real) inner product space. Every
finite-dimensional inner product space is also a Hilbert space. The basic feature of the dot product that connects it
with Euclidean geometry is that it is related to both the length (or norm) of a vector, denoted $\|x\|$, and to the angle $\theta$
between two vectors x and y by means of the formula

$$x \cdot y = \|x\|\, \|y\| \cos\theta.$$
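
The relation between the dot product, length and angle can be turned directly into a computation. The following is a minimal sketch (the function names dot, norm and angle are illustrative choices, and both vectors are assumed nonzero):

```python
import math

def dot(x, y):
    """Dot product of two vectors given by their Cartesian coordinates."""
    return sum(a * b for a, b in zip(x, y))

def norm(x):
    """Length of x, defined through the dot product."""
    return math.sqrt(dot(x, x))

def angle(x, y):
    """Angle theta in [0, pi] with x . y = |x| |y| cos(theta); x and y must be nonzero."""
    c = dot(x, y) / (norm(x) * norm(y))
    return math.acos(max(-1.0, min(1.0, c)))  # clamp to guard against rounding

x = (1.0, 0.0, 0.0)
y = (1.0, 1.0, 0.0)
theta = angle(x, y)   # the angle between x and y is pi/4
print(theta, math.pi / 4)
```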
Multivariable calculus in Euclidean space relies on the ability to compute limits, and to have useful criteria for
concluding that limits exist. A mathematical series

$$\sum_{n=0}^\infty x_n$$

consisting of vectors in $\mathbf{R}^3$ is absolutely convergent provided that the sum of the lengths converges as an ordinary
series of real numbers:

$$\sum_{k=0}^\infty \|x_k\| < \infty.$$

Just as with a series of scalars, a series of vectors that converges absolutely also converges to some limit vector L in
the Euclidean space, in the sense that

$$\left\| L - \sum_{k=0}^N x_k \right\| \to 0 \quad \text{as } N \to \infty.$$

[Figure: Completeness means that if a particle moves along the broken path (in blue) travelling a finite total distance, then the particle has a well-defined net displacement (in orange).]



This property expresses the completeness of Euclidean space: that a series which converges absolutely also 
converges in the ordinary sense. 
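
A concrete instance may help. In the sketch below (an illustrative example, not part of the original text) the terms are x_k = (2^{-k}, (-1/3)^k, 0), chosen because both the sum of lengths and the limit vector are known in closed form: the lengths are bounded by 2^{-k} + 3^{-k}, whose sum is 3.5, and the partial sums converge to L = (2, 3/4, 0).

```python
import math

def term(k):
    """k-th vector of the sample series: (2^-k, (-1/3)^k, 0)."""
    return (0.5 ** k, (-1.0 / 3.0) ** k, 0.0)

def length(v):
    """Euclidean length of a vector."""
    return math.sqrt(sum(c * c for c in v))

def partial_sum(N):
    """Sum of the terms x_0, ..., x_N."""
    s = [0.0, 0.0, 0.0]
    for k in range(N + 1):
        s = [a + b for a, b in zip(s, term(k))]
    return s

limit = (2.0, 0.75, 0.0)   # geometric series: 1/(1 - 1/2) and 1/(1 + 1/3)

def error(N):
    """Distance between the limit vector and the N-th partial sum."""
    return length([a - b for a, b in zip(limit, partial_sum(N))])

total_length = sum(length(term(k)) for k in range(200))  # stays below 3.5
print(total_length, error(5), error(20), error(60))
```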



Definition 

A Hilbert space H is a real or complex inner product space that is also a complete metric space with respect to the
distance function induced by the inner product.[2] To say that H is a complex inner product space means that H is a
complex vector space on which there is an inner product $\langle x, y \rangle$ associating a complex number to each pair of elements
x, y of H that satisfies the following properties:

• $\langle y, x \rangle$ is the complex conjugate of $\langle x, y \rangle$:

$$\langle y, x \rangle = \overline{\langle x, y \rangle}.$$

• $\langle x, y \rangle$ is linear in its first argument.[3] For all complex numbers a and b,

$$\langle a x_1 + b x_2, y \rangle = a \langle x_1, y \rangle + b \langle x_2, y \rangle.$$

• The inner product is positive definite:

$$\langle x, x \rangle \ge 0$$

where the case of equality holds precisely when x = 0.






It follows from properties 1 and 2 that a complex inner product is antilinear in its second argument, meaning that

$$\langle x, a y_1 + b y_2 \rangle = \bar{a} \langle x, y_1 \rangle + \bar{b} \langle x, y_2 \rangle.$$

A real inner product space is defined in the same way, except that H is a real vector space and the inner product takes
real values. Such an inner product will be bilinear: that is, linear in each argument.

The norm defined by the inner product $\langle \cdot, \cdot \rangle$ is the real-valued function

$$\|x\| = \sqrt{\langle x, x \rangle},$$

and the distance between two points x, y in H is defined in terms of the norm by

$$d(x, y) = \|x - y\| = \sqrt{\langle x - y, x - y \rangle}.$$

That this function is a distance function means (1) that it is symmetric in x and y, (2) that the distance between x and
itself is zero, and otherwise the distance between x and y must be positive, and (3) that the triangle inequality holds,
meaning that the length of one leg of a triangle xyz cannot exceed the sum of the lengths of the other two legs:

$$d(x, z) \le d(x, y) + d(y, z).$$




















This last property is ultimately a consequence of the more fundamental Cauchy-Schwarz inequality, which asserts

$$|\langle x, y \rangle| \le \|x\|\, \|y\|$$

with equality if and only if x and y are linearly dependent.
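
The axioms and the two inequalities can be checked on the simplest complex example, the space of pairs of complex numbers with the inner product ⟨x, y⟩ = Σ x_k conj(y_k). This particular inner product and the sample vectors are illustrative choices, not taken from the text; the sketch only verifies the stated properties on them.

```python
import math

def inner(x, y):
    """<x, y> = sum of x_k * conj(y_k): linear in the first argument."""
    return sum(a * b.conjugate() for a, b in zip(x, y))

def norm(x):
    """Norm induced by the inner product."""
    return math.sqrt(inner(x, x).real)

def dist(x, y):
    """Distance induced by the norm."""
    return norm([a - b for a, b in zip(x, y)])

x = [1 + 2j, 3 - 1j]
y = [2 - 1j, 1j]
z = [0.5j, -1 + 0j]

print(inner(x, y), inner(y, x))               # complex conjugates of each other
print(abs(inner(x, y)), norm(x) * norm(y))    # Cauchy-Schwarz inequality
print(dist(x, z), dist(x, y) + dist(y, z))    # triangle inequality
```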

Relative to a distance function defined in this way, any inner product space is a metric space, and sometimes is
known as a pre-Hilbert space. A pre-Hilbert space is a Hilbert space if in addition it is complete. Completeness is
expressed using a form of the Cauchy criterion for sequences in H: a pre-Hilbert space H is complete if every
Cauchy sequence converges with respect to this norm to an element in the space. Completeness can be characterized
by the following equivalent condition: if a series of vectors $\sum_{k=0}^\infty u_k$ converges absolutely in the sense that

$$\sum_{k=0}^\infty \|u_k\| < \infty,$$

then the series converges in H, in the sense that the partial sums converge to an element of H.

As a complete normed space, Hilbert spaces are by definition also Banach spaces. As such they are topological 
vector spaces, in which topological notions like the openness and closedness of subsets are well-defined. Of special 
importance is the notion of a closed linear subspace of a Hilbert space which, with the inner product induced by 
restriction, is also complete (being a closed set in a complete metric space) and therefore a Hilbert space in its own 
right. 






Second example: sequence spaces

The sequence space $\ell^2$ consists of all infinite sequences $z = (z_1, z_2, \ldots)$ of complex numbers such that the series

$$\sum_{n=1}^\infty |z_n|^2$$

converges. The inner product on $\ell^2$ is defined by

$$\langle z, w \rangle = \sum_{n=1}^\infty z_n \overline{w_n},$$

with the latter series converging as a consequence of the Cauchy-Schwarz inequality.

Completeness of the space holds provided that whenever a series of elements from $\ell^2$ converges absolutely (in norm),
then it converges to an element of $\ell^2$. The proof is basic in mathematical analysis, and permits mathematical series of
elements of the space to be manipulated with the same ease as series of complex numbers (or vectors in a
finite-dimensional Euclidean space).
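
For a concrete pair of square-summable sequences, partial sums of the inner product can be computed directly. In the sketch below the sequences z_n = 1/n and w_n = i/n² are illustrative choices (not from the text), picked because the limits are classical: the squared norm of z tends to π²/6, and |⟨z, w⟩| tends to the sum of 1/n³.

```python
import math

def inner_partial(z, w, N):
    """Partial sum over n = 1..N of z(n) * conj(w(n)), approximating <z, w>."""
    return sum(z(n) * complex(w(n)).conjugate() for n in range(1, N + 1))

z = lambda n: 1.0 / n        # square-summable: sum of 1/n^2 converges
w = lambda n: 1j / n ** 2    # square-summable: sum of 1/n^4 converges

N = 100000
norm_z_sq = inner_partial(z, z, N).real   # approaches pi^2 / 6
norm_w_sq = inner_partial(w, w, N).real
zw = inner_partial(z, w, N)               # terms are -i / n^3

print(norm_z_sq, math.pi ** 2 / 6)
print(abs(zw), math.sqrt(norm_z_sq * norm_w_sq))   # Cauchy-Schwarz inequality
```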



History 

Prior to the development of Hilbert spaces, other generalizations of Euclidean spaces were known to mathematicians
and physicists. In particular, the idea of an abstract linear space had gained some traction towards the end of the
19th century:[6] this is a space whose elements can be added together and multiplied by scalars (such as real or
complex numbers) without necessarily identifying these elements with "geometric" vectors, such as position and
momentum vectors in physical systems. Other objects studied by mathematicians at the turn of the 20th century, in
particular spaces of sequences (including series) and spaces of functions,[7] can naturally be thought of as linear
spaces. Functions, for instance, can be added together or multiplied by constant scalars, and these operations obey
the algebraic laws satisfied by addition and scalar multiplication of spatial vectors.

In the first decade of the 20th century, parallel developments led to the introduction of Hilbert spaces. The first of
these was the observation, which arose during David Hilbert and Erhard Schmidt's study of integral equations,[8]
that two square-integrable real-valued functions f and g on an interval [a,b] have an inner product

$$\langle f, g \rangle = \int_a^b f(x) g(x)\, dx$$

which has many of the familiar properties of the Euclidean dot product. In particular, the idea of an orthogonal
family of functions has meaning. Schmidt exploited the similarity of this inner product with the usual dot product to
prove an analog of the spectral decomposition for an operator of the form

$$f(x) \mapsto \int_a^b K(x, y) f(y)\, dy$$

where K is a continuous function symmetric in x and y. The resulting eigenfunction expansion expresses the function
K as a series of the form

$$K(x, y) = \sum_n \lambda_n \varphi_n(x) \varphi_n(y)$$

where the functions $\varphi_n$ are orthogonal in the sense that $\langle \varphi_n, \varphi_m \rangle = 0$ for all n ≠ m. The individual terms in this
series are sometimes referred to as elementary product solutions. However, there are eigenfunction expansions which
fail to converge in a suitable sense to a square-integrable function: the missing ingredient, which ensures
convergence, is completeness.[9]

The second development was the Lebesgue integral, an alternative to the Riemann integral introduced by Henri
Lebesgue in 1904.[10] The Lebesgue integral made it possible to integrate a much broader class of functions. In 1907,
Frigyes Riesz and Ernst Sigismund Fischer independently proved that the space $L^2$ of square Lebesgue-integrable
functions is a complete metric space.[11] As a consequence of the interplay between geometry and completeness, the
19th century results of Joseph Fourier, Friedrich Bessel and Marc-Antoine Parseval on trigonometric series easily
carried over to these more general spaces, resulting in a geometrical and analytical apparatus now usually known as
the Riesz-Fischer theorem.[12]

Further basic results were proved in the early 20th century. For example, the Riesz representation theorem was
independently established by Maurice Fréchet and Frigyes Riesz in 1907.[13] John von Neumann coined the term
abstract Hilbert space in his work on unbounded Hermitian operators.[14] Although other mathematicians such as
Hermann Weyl and Norbert Wiener had already studied particular Hilbert spaces in great detail, often from a
physically-motivated point of view, von Neumann gave the first complete and axiomatic treatment of them.[15] Von
Neumann later used them in his seminal work on the foundations of quantum mechanics,[16] and in his continued
work with Eugene Wigner. The name "Hilbert space" was soon adopted by others, for example by Hermann Weyl in
his book on quantum mechanics and the theory of groups.[17]

The significance of the concept of a Hilbert space was underlined with the realization that it offers one of the best
mathematical formulations of quantum mechanics.[18] In short, the states of a quantum mechanical system are
vectors in a certain Hilbert space, the observables are hermitian operators on that space, the symmetries of the
system are unitary operators, and measurements are orthogonal projections. The relation between quantum
mechanical symmetries and unitary operators provided an impetus for the development of the unitary representation
theory of groups, initiated in the 1928 work of Hermann Weyl.[17] On the other hand, in the early 1930s it became
clear that certain properties of classical dynamical systems can be analyzed using Hilbert space techniques in the
framework of ergodic theory.[19]

The algebra of observables in quantum mechanics is naturally an algebra of operators defined on a Hilbert space, 
according to Werner Heisenberg's matrix mechanics formulation of quantum theory. Von Neumann began 
investigating operator algebras in the 1930s, as rings of operators on a Hilbert space. The kind of algebras studied by 
von Neumann and his contemporaries are now known as von Neumann algebras. In the 1940s, Israel Gelfand, Mark 
Naimark and Irving Segal gave a definition of a kind of operator algebra called C*-algebras that on the one hand made no reference to an underlying Hilbert space, and on the other extrapolated many of the useful features of the operator algebras that had previously been studied. In particular, the spectral theorem for self-adjoint operators, which underlies much of the existing Hilbert space theory, was generalized to C*-algebras. These techniques are now basic
in abstract harmonic analysis and representation theory. 






Examples 

Lebesgue spaces 

Lebesgue spaces are function spaces associated to measure spaces $(X, M, \mu)$, where $X$ is a set, $M$ is a σ-algebra of subsets of $X$, and $\mu$ is a countably additive measure on $M$. Let $L^2(X, \mu)$ be the space of those complex-valued measurable functions on $X$ for which the Lebesgue integral of the square of the absolute value of the function is finite, i.e., for a function $f$ in $L^2(X, \mu)$,

\[ \int_X |f|^2 \, d\mu < \infty, \]

and where functions are identified if and only if they differ only on a set of measure zero.

The inner product of functions $f$ and $g$ in $L^2(X, \mu)$ is then defined as

\[ \langle f, g \rangle = \int_X f(t)\,\overline{g(t)}\,d\mu(t). \]

For $f$ and $g$ in $L^2$, this integral exists because of the Cauchy-Schwarz inequality, and defines an inner product on the space. Equipped with this inner product, $L^2$ is in fact complete.[20] The Lebesgue integral is essential to ensure completeness: on domains of real numbers, for instance, not enough functions are Riemann integrable.[21]

The Lebesgue spaces appear in many natural settings. The spaces $L^2(\mathbb{R})$ and $L^2([0,1])$ of square-integrable functions with respect to the Lebesgue measure on the real line and unit interval, respectively, are natural domains on which to define the Fourier transform and Fourier series. In other situations, the measure may be something other than the ordinary Lebesgue measure on the real line. For instance, if $w$ is any positive measurable function, the space of all measurable functions $f$ on the interval $[0,1]$ satisfying

\[ \int_0^1 |f(t)|^2\,w(t)\,dt < \infty \]

is called the weighted $L^2$ space $L^2_w([0,1])$, and $w$ is called the weight function. The inner product is defined by

\[ \langle f, g \rangle = \int_0^1 f(t)\,\overline{g(t)}\,w(t)\,dt. \]

The weighted space $L^2_w([0,1])$ is identical with the Hilbert space $L^2([0,1], \mu)$ where the measure $\mu$ of a Lebesgue-measurable set $A$ is defined by

\[ \mu(A) = \int_A w(t)\,dt. \]

Weighted $L^2$ spaces like this are frequently used to study orthogonal polynomials, because different families of orthogonal polynomials are orthogonal with respect to different weighting functions.
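
The dependence of an orthogonal family on the weight can be illustrated with a small sketch. It assumes real polynomials on [0,1] and the hypothetical weights w(t) = t^k (k = 0 or 1), chosen only because the integrals are then exact rationals; the function names `inner` and `gram_schmidt` are illustrative, not from the text. Gram-Schmidt applied to the monomials 1, t, t^2 yields a different orthogonal family for each weight.

```python
from fractions import Fraction

def inner(p, q, weight_power):
    """Weighted inner product <p, q> = integral_0^1 p(t) q(t) t**weight_power dt.

    Polynomials are coefficient lists [c0, c1, ...], where c_k multiplies t**k.
    Uses integral_0^1 t**m dt = 1/(m+1), so the result is an exact rational.
    """
    total = Fraction(0)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            total += Fraction(a) * Fraction(b) / (i + j + weight_power + 1)
    return total

def gram_schmidt(degree, weight_power):
    """Orthogonalize 1, t, ..., t**degree with respect to the weight t**weight_power."""
    basis = []
    for k in range(degree + 1):
        p = [Fraction(0)] * k + [Fraction(1)]          # the monomial t**k
        for q in basis:
            coeff = inner(p, q, weight_power) / inner(q, q, weight_power)
            q_padded = q + [Fraction(0)] * (len(p) - len(q))
            p = [a - coeff * b for a, b in zip(p, q_padded)]
        basis.append(p)
    return basis

unweighted = gram_schmidt(2, 0)   # weight w(t) = 1
weighted = gram_schmidt(2, 1)     # weight w(t) = t (assumed for illustration)
print(unweighted[1], weighted[1])
```

With w(t) = 1 the degree-one polynomial is t - 1/2, while with w(t) = t it is t - 2/3; each family is orthogonal only for its own weight.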



Sobolev spaces 

Sobolev spaces, denoted by $H^s$ or $W^{s,2}$, are Hilbert spaces. These are a special kind of function space in which differentiation may be performed, but which (unlike other Banach spaces such as the Hölder spaces) support the structure of an inner product. Because differentiation is permitted, Sobolev spaces are a convenient setting for the theory of partial differential equations.[22] They also form the basis of the theory of direct methods in the calculus of variations.[23]

For $s$ a non-negative integer and $\Omega \subset \mathbb{R}^n$, the Sobolev space $H^s(\Omega)$ contains $L^2$ functions whose weak derivatives of order up to $s$ are also $L^2$. The inner product in $H^s(\Omega)$ is

\[ \langle f, g \rangle = \int_\Omega f(x)\,\overline{g(x)}\,dx + \int_\Omega Df(x) \cdot D\overline{g(x)}\,dx + \cdots + \int_\Omega D^s f(x) \cdot D^s \overline{g(x)}\,dx \]

where the dot indicates the dot product in the Euclidean space of partial derivatives of each order. Sobolev spaces can also be defined when $s$ is not an integer.






Sobolev spaces are also studied from the point of view of spectral theory, relying more specifically on the Hilbert space structure. If $\Omega$ is a suitable domain, then one can define the Sobolev space $H^s(\Omega)$ as the space of Bessel potentials; roughly,

\[ H^s(\Omega) = \{ (1 - \Delta)^{-s/2} f \mid f \in L^2(\Omega) \}. \]

Here $\Delta$ is the Laplacian and $(1 - \Delta)^{-s/2}$ is understood in terms of the spectral mapping theorem. Apart from providing a workable definition of Sobolev spaces for non-integer $s$, this definition also has particularly desirable properties under the Fourier transform that make it ideal for the study of pseudodifferential operators. Using these methods on a compact Riemannian manifold, one can obtain for instance the Hodge decomposition which is the basis of Hodge theory.

Spaces of holomorphic functions 

Hardy spaces 

The Hardy spaces are function spaces, arising in complex analysis and harmonic analysis, whose elements are certain holomorphic functions in a complex domain.[26] Let $U$ denote the unit disc in the complex plane. Then the Hardy space $H^2(U)$ is defined to be the space of holomorphic functions $f$ on $U$ such that the means

\[ M_r(f) = \frac{1}{2\pi} \int_0^{2\pi} |f(re^{i\theta})|^2\,d\theta \]

remain bounded for $r < 1$. The norm on this Hardy space is defined by

\[ \|f\|_2 = \lim_{r \to 1} \sqrt{M_r(f)}. \]

Hardy spaces in the disc are related to Fourier series. A function $f$ is in $H^2(U)$ if and only if

\[ f(z) = \sum_{n=0}^{\infty} a_n z^n \]

where

\[ \sum_{n=0}^{\infty} |a_n|^2 < \infty. \]

Thus $H^2(U)$ consists of those functions which are $L^2$ on the circle, and whose negative frequency Fourier coefficients vanish.
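
The link between the means $M_r(f)$ and the Taylor coefficients can be checked numerically. The sketch below assumes the sample function f(z) = 1/(1 - z/2), whose coefficients are a_n = 2^(-n); this choice and the helper names are illustrative only. By Parseval's identity on the circle of radius r, M_r(f) equals the sum of |a_n|^2 r^(2n), so the squared norm is the sum of |a_n|^2, here 4/3.

```python
import cmath
import math

def mean_square(f, r, samples=2000):
    """Approximate M_r(f), the average of |f(r e^{i theta})|^2 over theta in [0, 2 pi)."""
    total = 0.0
    for k in range(samples):
        theta = 2.0 * math.pi * k / samples
        total += abs(f(r * cmath.exp(1j * theta))) ** 2
    return total / samples

def f(z):
    # Assumed example: holomorphic on the unit disc, Taylor coefficients a_n = 2**(-n).
    return 1.0 / (1.0 - z / 2.0)

def coefficient_sum(r, terms=200):
    """Partial sum of |a_n|^2 r^(2n) for a_n = 2**(-n)."""
    return sum((0.25 * r * r) ** n for n in range(terms))

for r in (0.5, 0.9, 0.99):
    print(r, mean_square(f, r), coefficient_sum(r))
```

The two columns agree for each radius, and as r approaches 1 both approach the squared norm.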

Bergman spaces 

The Bergman spaces are another family of Hilbert spaces of holomorphic functions.[27] Let $D$ be a bounded open set in the complex plane (or a higher dimensional complex space) and let $L^{2,h}(D)$ be the space of holomorphic functions $f$ in $D$ that are also in $L^2(D)$ in the sense that

\[ \|f\|^2 = \int_D |f(z)|^2\,d\mu(z) < \infty, \]

where the integral is taken with respect to the Lebesgue measure in $D$. Clearly $L^{2,h}(D)$ is a subspace of $L^2(D)$; in fact, it is a closed subspace, and so a Hilbert space in its own right. This is a consequence of the estimate, valid on compact subsets $K$ of $D$, that

\[ \sup_{z \in K} |f(z)| \le C_K \|f\|_2, \]

which in turn follows from Cauchy's integral formula. Thus convergence of a sequence of holomorphic functions in $L^2(D)$ implies also compact convergence, and so the limit function is also holomorphic. Another consequence of this inequality is that the linear functional that evaluates a function $f$ at a point of $D$ is actually continuous on $L^{2,h}(D)$. The Riesz representation theorem implies that the evaluation functional can be represented as an element of $L^{2,h}(D)$. Thus, for every $z \in D$, there is a function $\eta_z \in L^{2,h}(D)$ such that






\[ f(z) = \int_D f(\zeta)\,\overline{\eta_z(\zeta)}\,d\mu(\zeta) \]

for all $f \in L^{2,h}(D)$. The integrand

\[ K(\zeta, z) = \overline{\eta_z(\zeta)} \]

is known as the Bergman kernel of $D$. This integral kernel satisfies a reproducing property

\[ f(z) = \int_D f(\zeta)\,K(\zeta, z)\,d\mu(\zeta). \]

A Bergman space is an example of a reproducing kernel Hilbert space, which is a Hilbert space of functions along with a kernel $K(\zeta, z)$ that verifies a reproducing property analogous to this one. The Hardy space $H^2(D)$ also admits a reproducing kernel, known as the Szegő kernel. Reproducing kernels are common in other areas of mathematics as well. For instance, in harmonic analysis the Poisson kernel is a reproducing kernel for the Hilbert space of square-integrable harmonic functions in the unit ball. That the latter is a Hilbert space at all is a consequence of the mean value theorem for harmonic functions.
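
As a concrete illustration of the reproducing property (a standard example, not taken from the text above), assume that $D$ is the unit disc with Lebesgue area measure. With the kernel written as a function of $(\zeta, z)$ as in this section, the Bergman kernel of the disc is

\[ K(\zeta, z) = \frac{1}{\pi\,(1 - z\bar{\zeta})^2} = \frac{1}{\pi} \sum_{n=0}^{\infty} (n+1)\, z^n \bar{\zeta}^n . \]

The reproducing property can be checked on the monomials $f(\zeta) = \zeta^m$. Since $\int_D \zeta^m \bar{\zeta}^n\,d\mu(\zeta) = \frac{\pi}{m+1}\,\delta_{mn}$, only the term $n = m$ of the series survives, and

\[ \int_D \zeta^m\, K(\zeta, z)\,d\mu(\zeta) = \frac{1}{\pi}\,(m+1)\, z^m \cdot \frac{\pi}{m+1} = z^m . \]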

Applications 

Many of the applications of Hilbert spaces exploit the fact that Hilbert spaces support generalizations of simple 
geometric concepts like projection and change of basis from their usual finite dimensional setting. In particular, the 
spectral theory of continuous self-adjoint linear operators on a Hilbert space generalizes the usual spectral 
decomposition of a matrix, and this often plays a major role in applications of the theory to other areas of 
mathematics and physics. 

Sturm-Liouville theory 

In the theory of ordinary differential equations, spectral methods on a suitable Hilbert space are used to study the behavior of eigenvalues and eigenfunctions of differential equations. For example, the Sturm-Liouville problem arises in the study of the harmonics of waves in a violin string or a drum, and is a central problem in ordinary differential equations. The problem is a differential equation of the form

\[ -\frac{d}{dx}\left[ p(x) \frac{dy}{dx} \right] + q(x)\,y = \lambda\, w(x)\, y \]

for an unknown function $y$ on an interval $[a,b]$, satisfying general homogeneous Robin boundary conditions

\[ \begin{cases} \alpha\, y(a) + \alpha'\, y'(a) = 0 \\ \beta\, y(b) + \beta'\, y'(b) = 0. \end{cases} \]

[Figure: The overtones of a vibrating string. These are eigenfunctions of an associated Sturm-Liouville problem. The eigenvalues 1, 1/2, 1/3, ... form the (musical) harmonic series.]

The functions $p$, $q$, and $w$ are given in advance, and the problem is to find the function $y$ and constants $\lambda$ for which the equation has a solution. The problem only has solutions for certain values of $\lambda$, called eigenvalues of the system,






and this is a consequence of the spectral theorem for compact operators applied to the integral operator defined by the Green's function for the system. Furthermore, another consequence of this general result is that the eigenvalues $\lambda$ of the system can be arranged in an increasing sequence tending to infinity.[30]
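
A minimal worked instance shows the pattern. Assume the simplest data $p = 1$, $q = 0$, $w = 1$ on $[a,b] = [0,\pi]$ with Dirichlet conditions ($\alpha = \beta = 1$, $\alpha' = \beta' = 0$), and take the leading term of the equation with a minus sign, $-(p\,y')'$. Then

\[ -y'' = \lambda y, \quad y(0) = y(\pi) = 0 \quad\Longrightarrow\quad y_n(x) = \sin(nx), \quad \lambda_n = n^2, \quad n = 1, 2, 3, \ldots \]

The eigenvalues $1, 4, 9, \ldots$ form an increasing sequence tending to infinity, and the eigenfunctions are mutually orthogonal in $L^2([0,\pi])$:

\[ \int_0^{\pi} \sin(mx)\,\sin(nx)\,dx = \frac{\pi}{2}\,\delta_{mn}. \]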

Partial differential equations 

Hilbert spaces form a basic tool in the study of partial differential equations.[22] For many classes of partial differential equations, such as linear elliptic equations, it is possible to consider a generalized solution (known as a weak solution) by enlarging the class of functions. Many weak formulations involve the class of Sobolev functions, which is a Hilbert space. A suitable weak formulation reduces the analytic problem of finding a solution (or, often more importantly, of showing that a solution exists and is unique for given boundary data) to a geometrical problem. For linear elliptic equations, one geometrical result that ensures unique solvability for a large class of problems is the Lax-Milgram theorem. This strategy forms the rudiment of the Galerkin method (a finite element method) for numerical solution of partial differential equations.[31]

A typical example is the Poisson equation $-\Delta u = g$ with Dirichlet boundary conditions in a bounded domain $\Omega$ in $\mathbb{R}^2$. The weak formulation consists of finding a function $u$ such that, for all continuously differentiable functions $v$ in $\Omega$ vanishing on the boundary:

\[ \int_\Omega \nabla u \cdot \nabla v = \int_\Omega g\,v. \]

This can be recast in terms of the Hilbert space $H^1_0(\Omega)$ consisting of functions $u$ such that $u$, along with its weak partial derivatives, are square integrable on $\Omega$, and which vanish on the boundary. The question then reduces to finding $u$ in this space such that for all $v$ in this space

\[ a(u, v) = b(v) \]

where $a$ is a continuous bilinear form, and $b$ is a continuous linear functional, given respectively by

\[ a(u, v) = \int_\Omega \nabla u \cdot \nabla v, \qquad b(v) = \int_\Omega g\,v. \]

Since the Poisson equation is elliptic, it follows from Poincaré's inequality that the bilinear form $a$ is coercive. The Lax-Milgram theorem then ensures the existence and uniqueness of solutions of this equation.

Hilbert spaces allow for many elliptic partial differential equations to be formulated in a similar way, and the 
Lax-Milgram theorem is then a basic tool in their analysis. With suitable modifications, similar techniques can be 
applied to parabolic partial differential equations and certain hyperbolic partial differential equations. 
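
The Galerkin idea can be sketched in the simplest setting. The code below is an illustration under stated assumptions, not the method of any particular reference: it treats the one-dimensional analogue $-u'' = g$ on $(0,1)$ with $u(0) = u(1) = 0$ (rather than a planar domain), restricts the weak problem $a(u,v) = b(v)$ to piecewise-linear "hat" functions on a uniform grid, and solves the resulting tridiagonal system. The function name `solve_poisson_1d` is hypothetical.

```python
def solve_poisson_1d(g, n):
    """Galerkin solution of -u'' = g on (0,1), u(0) = u(1) = 0, with n interior nodes.

    Trial and test functions are piecewise-linear hat functions phi_i centred at
    x_i = (i + 1) * h. Returns the nodal values u(x_1), ..., u(x_n).
    """
    h = 1.0 / (n + 1)
    # Stiffness matrix a(phi_i, phi_j) = integral of phi_i' phi_j': tridiag(-1, 2, -1) / h.
    diag = [2.0 / h] * n
    off = [-1.0 / h] * (n - 1)
    # Load vector b(phi_i) = integral of g * phi_i, approximated by h * g(x_i)
    # (exact when g is constant or linear).
    rhs = [h * g((i + 1) * h) for i in range(n)]
    # Thomas algorithm: forward elimination, then back substitution.
    for i in range(1, n):
        m = off[i - 1] / diag[i - 1]
        diag[i] -= m * off[i - 1]
        rhs[i] -= m * rhs[i - 1]
    u = [0.0] * n
    u[-1] = rhs[-1] / diag[-1]
    for i in range(n - 2, -1, -1):
        u[i] = (rhs[i] - off[i] * u[i + 1]) / diag[i]
    return u

# Assumed test problem: g = 1, whose exact solution is u(x) = x (1 - x) / 2.
nodal_values = solve_poisson_1d(lambda x: 1.0, 9)
print(nodal_values[4])   # value at x = 0.5
```

The stiffness matrix is symmetric and positive definite, which is the discrete counterpart of the coercivity of the bilinear form $a$ and is what makes the linear system uniquely solvable.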



Ergodic theory 




The field of ergodic theory is the study of the long-term behavior of chaotic dynamical systems. The prototypical case of a field to which ergodic theory is applicable is thermodynamics, in which, although the microscopic state of a system is extremely complicated (it is impossible to understand the ensemble of individual collisions between particles of matter), the average behavior over sufficiently long time intervals is tractable. The laws of thermodynamics are assertions about such average behavior. In particular, one formulation of the zeroth law of thermodynamics asserts that over sufficiently long timescales, the only functionally independent measurement that one can make of a thermodynamic system in equilibrium is its total energy, in the form of temperature.

[Figure: The path of a billiard ball in the Bunimovich stadium is described by an ergodic dynamical system.]






An ergodic dynamical system is one for which, apart from the energy (measured by the Hamiltonian), there are no other functionally independent conserved quantities on the phase space. More explicitly, suppose that the energy $E$ is fixed, and let $\Omega_E$ be the subset of the phase space consisting of all states of energy $E$ (an energy surface), and let $T_t$ denote the evolution operator on the phase space. The dynamical system is ergodic if there are no continuous non-constant functions $f$ on $\Omega_E$ such that

\[ f(T_t w) = f(w) \]

for all $w$ on $\Omega_E$ and all time $t$. Liouville's theorem implies that there exists a measure $\mu$ on the energy surface that is invariant under the time translation. As a result, time translation is a unitary transformation of the Hilbert space $L^2(\Omega_E, \mu)$ consisting of square-integrable functions on the energy surface $\Omega_E$ with respect to the inner product

\[ \langle f, g \rangle_{L^2(\Omega_E, \mu)} = \int_{\Omega_E} f\,\bar{g}\,d\mu. \]

The von Neumann mean ergodic theorem states the following:

• If $U_t$ is a (strongly continuous) one-parameter semigroup of unitary operators on a Hilbert space $H$, and $P$ is the orthogonal projection onto the space of common fixed points of $U_t$, $\{x \in H \mid U_t x = x \text{ for all } t > 0\}$, then

\[ Px = \lim_{T \to \infty} \frac{1}{T} \int_0^T U_t x\,dt. \]

For an ergodic system, the fixed set of the time evolution consists only of the constant functions, so the ergodic theorem implies the following:[32] for any function $f \in L^2(\Omega_E, \mu)$,

\[ L^2\text{-}\lim_{T \to \infty} \frac{1}{T} \int_0^T f(T_t w)\,dt = \int_{\Omega_E} f(y)\,d\mu(y). \]

That is, the long time average of an observable $f$ is equal to its expectation value over an energy surface.
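
The equality of time and space averages can be observed in a toy model. The sketch below is a discrete-time analogue and an assumption of this illustration, not a system discussed above: the rotation x -> x + alpha (mod 1) of the circle by an irrational angle, which preserves Lebesgue measure and is ergodic. The observable and the function names are hypothetical.

```python
import math

def time_average(f, x0, alpha, steps):
    """Average of f along the orbit x0, x0 + alpha, x0 + 2*alpha, ... taken modulo 1."""
    total = 0.0
    x = x0
    for _ in range(steps):
        total += f(x)
        x = (x + alpha) % 1.0
    return total / steps

def observable(x):
    # Assumed observable; its space average over [0, 1) is exactly 1/2.
    return math.cos(2.0 * math.pi * x) ** 2

alpha = math.sqrt(2.0)   # irrational rotation angle
for steps in (10, 100, 10000):
    print(steps, time_average(observable, 0.1, alpha, steps))
```

The printed averages settle near 1/2, the integral of the observable over the circle, regardless of the starting point.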

Fourier analysis 



One of the basic goals of Fourier analysis is to decompose a function into a (possibly infinite) linear combination of given basis functions: the associated Fourier series. The classical Fourier series associated to a function $f$ defined on the interval $[0,1]$ is a series of the form

\[ \sum_{n=-\infty}^{\infty} a_n e^{2\pi i n \theta} \]

where

\[ a_n = \int_0^1 f(\theta)\, e^{-2\pi i n \theta}\,d\theta. \]

[Figure: Superposition of sinusoidal wave basis functions (bottom) to form a sawtooth wave (top).]

[Figure: Spherical harmonics, an orthonormal basis for the Hilbert space of square-integrable functions on the sphere, shown graphed along the radial direction.]

The example of adding up the first few terms in a Fourier series for a sawtooth function is shown in the figure. The basis functions are sine waves with wavelengths $\lambda/n$ ($n$ an integer) shorter than the wavelength $\lambda$ of the sawtooth itself (except for $n = 1$, the fundamental wave). All basis functions have nodes at the nodes of the sawtooth, but all but the fundamental have additional nodes. The oscillation of the summed terms about the sawtooth is called the Gibbs phenomenon.

A significant problem in classical Fourier series asks in what sense the Fourier series converges, if at all, to the function $f$. Hilbert space methods provide one possible answer to this question. The functions $e_n(\theta) = e^{2\pi i n \theta}$ form an orthogonal basis of the Hilbert space $L^2([0,1])$. Consequently, any square-integrable function can be expressed as a series

\[ f(\theta) = \sum_n a_n e_n(\theta), \qquad a_n = \langle f, e_n \rangle \]

and, moreover, this series converges in the Hilbert space sense (that is, in the $L^2$ mean).
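
Convergence in the $L^2$ mean can be made tangible by tracking how much of the squared norm the first few coefficients capture. The following sketch assumes the sawtooth $f(\theta) = \theta$ on $[0,1]$ (for which the squared norm is 1/3) and approximates the coefficient integrals with a midpoint rule; the function names are illustrative.

```python
import cmath
import math

def fourier_coefficient(f, n, samples=4000):
    """Approximate a_n = integral_0^1 f(theta) exp(-2 pi i n theta) d theta (midpoint rule)."""
    h = 1.0 / samples
    total = 0j
    for k in range(samples):
        theta = (k + 0.5) * h
        total += f(theta) * cmath.exp(-2j * math.pi * n * theta)
    return total * h

def sawtooth(theta):
    # Assumed example function on [0, 1]; its squared L^2 norm is 1/3.
    return theta

def partial_energy(f, N):
    """Sum of |a_n|^2 over |n| <= N, the squared norm of the N-th partial sum."""
    return sum(abs(fourier_coefficient(f, n)) ** 2 for n in range(-N, N + 1))

for N in (1, 5, 10):
    print(N, partial_energy(sawtooth, N), 1.0 / 3.0)
```

The partial energies increase with N and stay below 1/3 (Bessel's inequality); the gap is exactly the squared $L^2$ error of the partial sum, which tends to zero.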

The problem can also be studied from the abstract point of view: every Hilbert space has an orthonormal basis, and every element of the Hilbert space can be written in a unique way as a sum of multiples of these basis elements. The coefficients appearing on these basis elements are sometimes known abstractly as the Fourier coefficients of the element of the space.[34] The abstraction is especially useful when it is more natural to use different basis functions for a space such as $L^2([0,1])$. In many circumstances, it is desirable not to decompose a function into trigonometric functions, but rather into orthogonal polynomials or wavelets for instance,[35] and in higher dimensions into spherical harmonics.[36]

For instance, if $e_n$ are any orthonormal basis functions of $L^2[0,1]$, then a given function in $L^2[0,1]$ can be approximated as a finite linear combination[37]

\[ f(x) \approx f_n(x) = a_1 e_1(x) + a_2 e_2(x) + \cdots + a_n e_n(x). \]

The coefficients $\{a_j\}$ are selected to make the magnitude of the difference $\|f - f_n\|^2$ as small as possible. Geometrically, the best approximation is the orthogonal projection of $f$ onto the subspace consisting of all linear combinations of the $\{e_j\}$, and can be calculated by[38]

\[ a_j = \int_0^1 \overline{e_j(x)}\, f(x)\,dx. \]

That this formula minimizes the difference $\|f - f_n\|^2$ is a consequence of Bessel's inequality and Parseval's formula.






In various applications to physical problems, a function can be decomposed into physically meaningful eigenfunctions of a differential operator (typically the Laplace operator): this forms the foundation for the spectral study of functions, in reference to the spectrum of the differential operator.[39] A concrete physical application involves the problem of hearing the shape of a drum: given the fundamental modes of vibration that a drumhead is capable of producing, can one infer the shape of the drum itself? The mathematical formulation of this question involves the Dirichlet eigenvalues of the Laplace equation in the plane, that represent the fundamental modes of vibration in direct analogy with the integers that represent the fundamental modes of vibration of the violin string.

Spectral theory also underlies certain aspects of the Fourier transform of a function. Whereas Fourier analysis 
decomposes a function defined on a compact set into the discrete spectrum of the Laplacian (which corresponds to 
the vibrations of a violin string or drum), the Fourier transform of a function is the decomposition of a function 
defined on all of Euclidean space into its components in the continuous spectrum of the Laplacian. The Fourier 
transformation is also geometrical, in a sense made precise by the Plancherel theorem, that asserts that it is an 
isometry of one Hilbert space (the "time domain") with another (the "frequency domain"). This isometry property of 
the Fourier transformation is a recurring theme in abstract harmonic analysis, as evidenced for instance by the 
Plancherel theorem for spherical functions occurring in noncommutative harmonic analysis. 




Quantum mechanics 



In the mathematically rigorous formulation of quantum mechanics, developed by Paul Dirac and John von Neumann, the possible states (more precisely, the pure states) of a quantum mechanical system are represented by unit vectors (called state vectors) residing in a complex separable Hilbert space, known as the state space, well defined up to a complex number of norm 1 (the phase factor). In other words, the possible states are points in the projectivization of a Hilbert space, usually called the complex projective space. The exact nature of this Hilbert space is dependent on the system; for example, the state space for the position and momentum states of a single non-relativistic spin-zero particle is the space of all square-integrable functions, while the states for the spin of a single proton are unit elements of the two-dimensional complex Hilbert space of spinors. Each observable is represented by a self-adjoint linear operator acting on the state space. Each eigenstate of an observable corresponds to an eigenvector of the operator, and the associated eigenvalue corresponds to the value of the observable in that eigenstate.

[Figure: The orbitals of an electron in a hydrogen atom are eigenfunctions of the energy.]



The time evolution of a quantum state is described by the Schrodinger equation, in which the Hamiltonian, the 
operator corresponding to the total energy of the system, generates time evolution. 

The inner product between two state vectors is a complex number known as a probability amplitude. During an ideal 
measurement of a quantum mechanical system, the probability that a system collapses from a given initial state to a 
particular eigenstate is given by the square of the absolute value of the probability amplitudes between the initial and 
final states. The possible results of a measurement are the eigenvalues of the operator — which explains the choice of 
self-adjoint operators, for all the eigenvalues must be real. The probability distribution of an observable in a given 
state can be found by computing the spectral decomposition of the corresponding operator. 
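
The probability rule just described can be sketched for the two-dimensional spin state space mentioned earlier. The example assumes a state prepared "spin up" along z and measured in the eigenbasis of the Pauli operator sigma_x; the vectors and helper names are illustrative choices, not taken from the text.

```python
import math

def inner(u, v):
    """Inner product <u, v>, linear in u and conjugate-linear in v."""
    return sum(a * b.conjugate() for a, b in zip(u, v))

def born_probabilities(state, eigenbasis):
    """Squared absolute values of the amplitudes between state and each eigenvector."""
    norm_sq = inner(state, state).real
    return [abs(inner(state, e)) ** 2 / norm_sq for e in eigenbasis]

s = 1.0 / math.sqrt(2.0)
up = [1 + 0j, 0j]                 # assumed initial state: eigenvector of sigma_z
plus = [s + 0j, s + 0j]           # eigenvector of sigma_x with eigenvalue +1
minus = [s + 0j, -s + 0j]         # eigenvector of sigma_x with eigenvalue -1

probabilities = born_probabilities(up, [plus, minus])
print(probabilities)
```

Because the eigenvectors form an orthonormal basis, the probabilities sum to one; here each outcome is equally likely.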

For a general system, states are typically not pure, but instead are represented as statistical mixtures of pure states, or 
mixed states, given by density matrices: self-adjoint operators of trace one on a Hilbert space. Moreover, for general 
quantum mechanical systems, the effects of a single measurement can influence other parts of a system in a manner 
that is described instead by a positive operator valued measure. Thus the structure both of the states and observables 






in the general theory is considerably more complicated than the idealization for pure states. 

Heisenberg's uncertainty principle is represented by the statement that the operators corresponding to certain 
observables do not commute, and gives a specific form that the commutator must have. 



Properties 
Pythagorean identity 

Two vectors $u$ and $v$ in a Hilbert space $H$ are orthogonal when $\langle u, v \rangle = 0$. The notation for this is $u \perp v$. More generally, when $S$ is a subset in $H$, the notation $u \perp S$ means that $u$ is orthogonal to every element from $S$.

When $u$ and $v$ are orthogonal, one has

\[ \|u + v\|^2 = \langle u + v, u + v \rangle = \langle u, u \rangle + 2\,\mathrm{Re}\,\langle u, v \rangle + \langle v, v \rangle = \|u\|^2 + \|v\|^2. \]

By induction on $n$, this is extended to any family $u_1, \ldots, u_n$ of $n$ orthogonal vectors,

\[ \|u_1 + \cdots + u_n\|^2 = \|u_1\|^2 + \cdots + \|u_n\|^2. \]

Whereas the Pythagorean identity as stated is valid in any inner product space, completeness is required for the extension of the Pythagorean identity to series. A series $\sum u_k$ of orthogonal vectors converges in $H$ if and only if the series of squares of norms converges, and

\[ \Big\| \sum_{k=0}^{\infty} u_k \Big\|^2 = \sum_{k=0}^{\infty} \|u_k\|^2. \]

Furthermore, the sum of a series of orthogonal vectors is independent of the order in which it is taken.



Parallelogram identity and polarization 



By definition, every Hilbert space is also a Banach space. Furthermore, in every Hilbert space the following parallelogram identity holds:

\[ \|u + v\|^2 + \|u - v\|^2 = 2\left( \|u\|^2 + \|v\|^2 \right). \]

[Figure: Geometrically, the parallelogram identity asserts that AC^2 + BD^2 = 2(AB^2 + AD^2). In words, the sum of the squares of the diagonals is twice the sum of the squares of any two adjacent sides.]

Conversely, every Banach space in which the parallelogram identity holds is a Hilbert space, and the inner product is uniquely determined by the norm by the polarization identity.[43] For real Hilbert spaces, the polarization identity is

\[ \langle u, v \rangle = \frac{1}{4} \left( \|u + v\|^2 - \|u - v\|^2 \right). \]

For complex Hilbert spaces, it is

\[ \langle u, v \rangle = \frac{1}{4} \left( \|u + v\|^2 - \|u - v\|^2 + i\|u + iv\|^2 - i\|u - iv\|^2 \right). \]

The parallelogram law implies that any Hilbert space is a uniformly convex Banach space.[44]
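
Both identities are easy to check numerically in a finite-dimensional complex space. The sketch below assumes the standard inner product on C^3, linear in the first argument and conjugate-linear in the second (the convention under which the complex polarization identity has the signs shown above); the sample vectors and helper names are arbitrary.

```python
def inner(u, v):
    """Standard inner product on C^n, linear in u and conjugate-linear in v."""
    return sum(a * b.conjugate() for a, b in zip(u, v))

def norm_sq(u):
    return inner(u, u).real

def add(u, v, scale=1):
    """Return the vector u + scale * v."""
    return [a + scale * b for a, b in zip(u, v)]

def polarization(u, v):
    """Recover <u, v> from norms alone via the complex polarization identity."""
    return 0.25 * (norm_sq(add(u, v)) - norm_sq(add(u, v, -1))
                   + 1j * norm_sq(add(u, v, 1j)) - 1j * norm_sq(add(u, v, -1j)))

u = [1 + 2j, -3j, 0.5 + 0j]
v = [2 - 1j, 1 + 1j, 4j]

parallelogram_gap = (norm_sq(add(u, v)) + norm_sq(add(u, v, -1))
                     - 2 * (norm_sq(u) + norm_sq(v)))
print(polarization(u, v), inner(u, v), parallelogram_gap)
```

The first two printed values coincide and the third is zero up to rounding, as the identities require.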






Best approximation 

If $C$ is a non-empty closed convex subset of a Hilbert space $H$ and $x$ a point in $H$, there exists a unique point $y \in C$ which minimizes the distance between $x$ and points in $C$,[45]

\[ y \in C, \qquad \|x - y\| = \operatorname{dist}(x, C) = \min\{ \|x - z\| : z \in C \}. \]

This is equivalent to saying that there is a point with minimal norm in the translated convex set $D = C - x$. The proof consists in showing that every minimizing sequence $(d_n) \subset D$ is Cauchy (using the parallelogram identity) hence converges (using completeness) to a point in $D$ that has minimal norm. More generally, this holds in any uniformly convex Banach space.[46]

When this result is applied to a closed subspace $F$ of $H$, it can be shown that the point $y \in F$ closest to $x$ is characterized by[47]

\[ y \in F, \qquad x - y \perp F. \]

This point $y$ is the orthogonal projection of $x$ onto $F$, and the mapping $P_F : x \to y$ is linear (see Orthogonal complements and projections). This result is especially significant in applied mathematics, especially numerical analysis, where it forms the basis of least squares methods.

In particular, when $F$ is not equal to $H$, one can find a non-zero vector $v$ orthogonal to $F$ (select $x$ not in $F$ and $v = x - y$). A very useful criterion is obtained by applying this observation to the closed subspace $F$ generated by a subset $S$ of $H$.

A subset $S$ of $H$ spans a dense vector subspace if (and only if) the vector 0 is the sole vector $v \in H$ orthogonal to $S$.
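
The characterization of the closest point in a closed subspace (the difference between the point and its projection is orthogonal to the subspace) can be sketched in R^3. The code assumes a real, finite-dimensional setting and builds an orthonormal basis of the subspace by Gram-Schmidt; the subspace, the point, and the function names are illustrative.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def project(x, spanning_vectors):
    """Orthogonal projection of x onto the span of the given vectors in R^n."""
    basis = []
    for v in spanning_vectors:
        w = list(v)
        for b in basis:                       # remove components along earlier directions
            c = dot(w, b)
            w = [wi - c * bi for wi, bi in zip(w, b)]
        length = dot(w, w) ** 0.5
        if length > 1e-12:                    # skip linearly dependent vectors
            basis.append([wi / length for wi in w])
    y = [0.0] * len(x)
    for b in basis:                           # y = sum of <x, b> b over the orthonormal basis
        c = dot(x, b)
        y = [yi + c * bi for yi, bi in zip(y, b)]
    return y

F = [[1.0, 1.0, 0.0], [0.0, 1.0, 1.0]]        # assumed spanning set of a plane in R^3
x = [1.0, 2.0, 4.0]
y = project(x, F)
residual = [xi - yi for xi, yi in zip(x, y)]
print(y, residual)
```

The residual is orthogonal to both spanning vectors, which is exactly the normal-equation condition behind least squares fitting.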

Duality 

The dual space $H^*$ is the space of all continuous linear functions from the space $H$ into the base field. It carries a natural norm, defined by

\[ \|\varphi\| = \sup_{\|x\| = 1,\; x \in H} |\varphi(x)|. \]

This norm satisfies the parallelogram law, and so the dual space is also an inner product space. The dual space is also complete, and so it is a Hilbert space in its own right.

The Riesz representation theorem affords a convenient description of the dual. To every element $u$ of $H$, there is a unique element $\varphi_u$ of $H^*$, defined by

\[ \varphi_u(x) = \langle x, u \rangle. \]

The mapping $u \mapsto \varphi_u$ is an antilinear mapping from $H$ to $H^*$. The Riesz representation theorem states that this mapping is an antilinear isomorphism.[48] Thus to every element $\varphi$ of the dual $H^*$ there exists one and only one $u_\varphi$ in $H$ such that

\[ \langle x, u_\varphi \rangle = \varphi(x) \]

for all $x \in H$. The inner product on the dual space $H^*$ satisfies

\[ \langle \varphi, \psi \rangle = \langle u_\psi, u_\varphi \rangle. \]

The reversal of order on the right-hand side restores linearity in $\varphi$ from the antilinearity of $u_\varphi$. In the real case, the antilinear isomorphism from $H$ to its dual is actually an isomorphism, and so real Hilbert spaces are naturally isomorphic to their own duals.

The representing vector $u_\varphi$ is obtained in the following way. When $\varphi \ne 0$, the kernel $F = \ker \varphi$ is a closed vector subspace of $H$, not equal to $H$, hence there exists a non-zero vector $v$ orthogonal to $F$. The vector $u$ is a suitable scalar multiple $\lambda v$ of $v$. The requirement that $\varphi(v) = \langle v, u \rangle$ yields

\[ u = \langle v, v \rangle^{-1}\, \overline{\varphi(v)}\, v. \]
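
The scalar in this formula can be derived in one line. Writing $u = \lambda v$ and using that the inner product is conjugate-linear in its second argument,

\[ \varphi(v) = \langle v, \lambda v \rangle = \bar{\lambda}\, \langle v, v \rangle \quad\Longrightarrow\quad \lambda = \frac{\overline{\varphi(v)}}{\langle v, v \rangle}. \]

To see that this $u$ represents $\varphi$ on all of $H$, note that $\varphi(v) \ne 0$ (otherwise $v$ would lie in $F$ and be orthogonal to itself), and decompose an arbitrary $x$ as $x = (x - c\,v) + c\,v$ with $c = \varphi(x)/\varphi(v)$. The first term lies in $\ker \varphi$ and is therefore orthogonal to $v$, so

\[ \langle x, u \rangle = c\, \langle v, u \rangle = c\, \varphi(v) = \varphi(x). \]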






This correspondence $\varphi \leftrightarrow u$ is exploited by the bra-ket notation popular in physics. It is common in physics to assume that the inner product, denoted by $\langle x | y \rangle$, is linear on the right,

\[ \langle x | y \rangle = \langle y, x \rangle. \]

The result $\langle x | y \rangle$ can be seen as the action of the linear functional $\langle x |$ (the bra) on the vector $| y \rangle$ (the ket).

The Riesz representation theorem relies fundamentally not just on the presence of an inner product, but also on the 
completeness of the space. In fact, the theorem implies that the topological dual of any inner product space can be 
identified with its completion. An immediate consequence of the Riesz representation theorem is also that a Hilbert 
space H is reflexive, meaning that the natural map from H into its double dual space is an isomorphism. 

Weakly convergent sequences 

In a Hilbert space $H$, a sequence $\{x_n\}$ is weakly convergent to a vector $x \in H$ when

\[ \lim_n \langle x_n, v \rangle = \langle x, v \rangle \]

for every $v \in H$.

For example, any orthonormal sequence $\{f_n\}$ converges weakly to 0, as a consequence of Bessel's inequality. Every weakly convergent sequence $\{x_n\}$ is bounded, by the uniform boundedness principle.

Conversely, every bounded sequence in a Hilbert space admits weakly convergent subsequences (Alaoglu's theorem).[49] This fact may be used to prove minimization results for continuous convex functionals, in the same way that the Bolzano-Weierstrass theorem is used for continuous functions on $\mathbb{R}^d$. Among several variants, one simple statement is as follows:[50]

If $f : H \to \mathbb{R}$ is a convex continuous function such that $f(x)$ tends to $+\infty$ when $\|x\|$ tends to $\infty$, then $f$ admits a minimum at some point $x_0 \in H$.

This fact (and its various generalizations) is fundamental for direct methods in the calculus of variations.
Minimization results for convex functionals are also a direct consequence of the slightly more abstract fact that 
closed bounded convex subsets in a Hilbert space H are weakly compact, since H is reflexive. The existence of 
weakly convergent subsequences is a special case of the Eberlein-Smulian theorem. 

Banach space properties 

Any general property of Banach spaces continues to hold for Hilbert spaces. The open mapping theorem states that a 
continuous surjective linear transformation from one Banach space to another is an open mapping meaning that it 
sends open sets to open sets. A corollary is the bounded inverse theorem, that a continuous and bijective linear 
function from one Banach space to another is an isomorphism (that is, a continuous linear map whose inverse is also 
continuous). This theorem is considerably simpler to prove in the case of Hilbert spaces than in general Banach 
spaces. The open mapping theorem is equivalent to the closed graph theorem, which asserts that a linear function from
one Banach space to another is continuous if and only if its graph is a closed set. In the case of Hilbert spaces, this 
is basic in the study of unbounded operators (see closed operator). 

The (geometrical) Hahn-Banach theorem asserts that a closed convex set can be separated from any point outside it 
by means of a hyperplane of the Hilbert space. This is an immediate consequence of the best approximation 
property: if y is the element of a closed convex set F closest to x, then the separating hyperplane is the plane 
perpendicular to the segment xy passing through its midpoint. 

The distortion problem on Hilbert space asks whether or not every real valued Lipschitz function $f$ defined on the sphere of a separable and infinite dimensional Hilbert space $H$ stabilizes on the sphere of an infinite dimensional subspace, i.e. whether there is a real number $a \in \mathbb{R}$ so that for every $\varepsilon > 0$ there is an infinite dimensional subspace $Y$ of $H$, so that $|a - f(y)| < \varepsilon$ for all $y \in Y$ with $\|y\| = 1$. This problem was solved negatively by E. Odell and Th. Schlumprecht (1994).






Operators on Hilbert spaces 
Bounded operators 

The continuous linear operators A : —> from a Hilbert space to a second Hilbert space are bounded in 
the sense that they map bounded sets to bounded sets. Conversely, if an operator is bounded, then it is continuous. 
The space of such bounded linear operators has a norm, the operator norm given by 

\\A\\ = sup{ \\Ax\\ : < 1}. 

The sum and the composite of two bounded linear operators are again bounded and linear. For $y$ in $H_2$, the map that 
sends $x \in H_1$ to $\langle Ax, y \rangle$ is linear and continuous, and according to the Riesz representation theorem can therefore be 
represented in the form 

$$\langle x, A^* y \rangle = \langle Ax, y \rangle$$

for some vector $A^* y$ in $H_1$. This defines another bounded linear operator $A^* : H_2 \to H_1$, the adjoint of $A$. One can see 
that $A^{**} = A$. 
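
In finite dimensions the adjoint and the operator norm can be checked numerically. The following is a minimal sketch, assuming NumPy and taking $H_1 = \mathbb{C}^4$, $H_2 = \mathbb{C}^3$ with the standard inner products (linear in the first argument); the matrix and vectors are random illustrative data, not taken from the text.

```python
import numpy as np

rng = np.random.default_rng(0)

# A bounded operator from H_1 = C^4 to H_2 = C^3, represented by a 3x4 matrix.
A = rng.normal(size=(3, 4)) + 1j * rng.normal(size=(3, 4))
A_adj = A.conj().T  # in the standard bases the adjoint is the conjugate transpose


def inner(u, v):
    """Inner product, linear in the first argument, conjugate-linear in the second."""
    return np.sum(u * np.conj(v))


x = rng.normal(size=4) + 1j * rng.normal(size=4)
y = rng.normal(size=3) + 1j * rng.normal(size=3)

lhs = inner(A @ x, y)      # <Ax, y>, computed in H_2
rhs = inner(x, A_adj @ y)  # <x, A*y>, computed in H_1

# The operator norm sup{||Ax|| : ||x|| <= 1} equals the largest singular value.
op_norm = np.linalg.norm(A, 2)

# Sanity check: ||Ax|| over random unit vectors never exceeds the operator norm.
samples = rng.normal(size=(4, 1000)) + 1j * rng.normal(size=(4, 1000))
samples /= np.linalg.norm(samples, axis=0)
sampled_max = np.linalg.norm(A @ samples, axis=0).max()
```

Random sampling only gives a lower bound on the supremum; the singular value computation is what actually evaluates the norm.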

The set B(H) of all bounded linear operators on H, together with the addition and composition operations, the norm 
and the adjoint operation, is a C*-algebra, which is a type of operator algebra. 

An element $A$ of $B(H)$ is called self-adjoint or Hermitian if $A^* = A$. If $A$ is Hermitian and $\langle Ax, x \rangle \ge 0$ for every $x$, 
then $A$ is called non-negative, written $A \ge 0$; if equality holds only when $x = 0$, then $A$ is called positive. The set of 
self-adjoint operators admits a partial order, in which $A \ge B$ if $A - B \ge 0$. If $A$ has the form $B^*B$ for some $B$, then $A$ is 
non-negative; if $B$ is invertible, then $A$ is positive. A converse is also true in the sense that, for a non-negative 
operator $A$, there exists a unique non-negative square root $B$ such that 

$$A = B^2 = B^* B.$$
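
For a matrix, the non-negative square root can be computed from an eigendecomposition. This is a sketch under the assumption that the operator acts on $\mathbb{C}^4$ and that NumPy is available; the matrix `C` is arbitrary illustrative data.

```python
import numpy as np

rng = np.random.default_rng(1)
C = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
A = C.conj().T @ C  # A = C*C is Hermitian and non-negative

# Diagonalize A = U diag(eigvals) U* with U unitary.
eigvals, U = np.linalg.eigh(A)
eigvals = np.clip(eigvals, 0.0, None)  # remove tiny negative round-off

# Square root: B = U diag(sqrt(eigvals)) U*.
B = (U * np.sqrt(eigvals)) @ U.conj().T
```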

In a sense made precise by the spectral theorem, self-adjoint operators can usefully be thought of as operators that 
are "real". An element $A$ of $B(H)$ is called normal if $A^*A = AA^*$. Normal operators decompose into the sum of a 
self-adjoint operator and an imaginary multiple of a self-adjoint operator 

$$A = \frac{A + A^*}{2} + i\,\frac{A - A^*}{2i}$$

that commute with each other. Normal operators can also usefully be thought of in terms of their real and imaginary 
parts. 
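
The decomposition into commuting real and imaginary parts can be verified on a small normal matrix. The sketch below assumes NumPy and builds a normal matrix as a unitary conjugate of a complex diagonal matrix; this construction is an illustrative choice, not part of the text.

```python
import numpy as np

rng = np.random.default_rng(2)

# Build a normal matrix A = Q diag(d) Q* with Q unitary and d complex.
Q, _ = np.linalg.qr(rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4)))
d = rng.normal(size=4) + 1j * rng.normal(size=4)
A = (Q * d) @ Q.conj().T

re_part = (A + A.conj().T) / 2     # self-adjoint "real part"
im_part = (A - A.conj().T) / (2j)  # self-adjoint "imaginary part"
```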

An element $U$ of $B(H)$ is called unitary if $U$ is invertible and its inverse is given by $U^*$. This can also be expressed by 
requiring that $U$ be onto and $\langle Ux, Uy \rangle = \langle x, y \rangle$ for all $x$ and $y$ in $H$. The unitary operators form a group under 
composition, which is the isometry group of $H$. 

An element of $B(H)$ is compact if it sends bounded sets to relatively compact sets. Equivalently, a bounded operator 
$T$ is compact if, for any bounded sequence $\{x_k\}$, the sequence $\{Tx_k\}$ has a convergent subsequence. Many integral 
operators are compact, and in fact define a special class of operators known as Hilbert-Schmidt operators that are 
especially important in the study of integral equations. Operators that differ from a compact operator by a nonzero 
multiple of the identity are Fredholm operators; in general, Fredholm operators are characterized as bounded 
operators with a finite dimensional kernel and cokernel. The index of a Fredholm operator $T$ is defined by 

$$\operatorname{index} T = \dim \ker T - \dim \operatorname{coker} T.$$

The index is homotopy invariant, and plays a deep role in differential geometry via the Atiyah-Singer index 
theorem. 






Unbounded operators 

Unbounded operators are also tractable in Hilbert spaces, and have important applications to quantum mechanics.[54] 
An unbounded operator $T$ on a Hilbert space $H$ is defined to be a linear operator whose domain $D(T)$ is a linear 
subspace of $H$. Often the domain $D(T)$ is a dense subspace of $H$, in which case $T$ is known as a densely-defined 
operator. 

The adjoint of a densely defined unbounded operator is defined in essentially the same manner as for bounded 
operators. Self-adjoint unbounded operators play the role of the observables in the mathematical formulation of 
quantum mechanics. Examples of self-adjoint unbounded operators on the Hilbert space $L^2(\mathbb{R})$ are:[55] 

• A suitable extension of the differential operator 

$$(Af)(x) = i \frac{d}{dx} f(x),$$

where $i$ is the imaginary unit and $f$ is a differentiable function of compact support. 

• The multiplication-by-$x$ operator: 

$$(Bf)(x) = x f(x).$$

These correspond to the momentum and position observables, respectively. Note that neither $A$ nor $B$ is defined on 
all of $H$, since in the case of $A$ the derivative need not exist, and in the case of $B$ the product function need not be 
square integrable. In both cases, the set of possible arguments forms a dense subspace of $L^2(\mathbb{R})$. 
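
As a worked illustration of why the multiplication operator cannot be defined on all of $L^2(\mathbb{R})$, consider the following function, chosen here purely as an example (it does not appear in the source). It is square integrable, but its product with $x$ is not:

```latex
f(x) = \frac{1}{\sqrt{1 + x^2}}, \qquad
\int_{\mathbb{R}} |f(x)|^2 \, dx = \int_{\mathbb{R}} \frac{dx}{1 + x^2} = \pi < \infty,
\qquad
\int_{\mathbb{R}} |x f(x)|^2 \, dx = \int_{\mathbb{R}} \frac{x^2}{1 + x^2} \, dx = \infty .
```

The last integral diverges because the integrand tends to 1 as $|x| \to \infty$, so this $f$ lies in $L^2(\mathbb{R})$ but outside the domain of the multiplication operator.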



Constructions 
Direct sums 

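Two Hilbert spaces $H_1$ and $H_2$ can be combined into another Hilbert space, called the (orthogonal) direct sum,[56] 
and denoted 

$$H_1 \oplus H_2,$$

consisting of the set of all ordered pairs $(x_1, x_2)$ where $x_i \in H_i$, $i = 1, 2$, and inner product defined by 

$$\langle (x_1, x_2), (y_1, y_2) \rangle_{H_1 \oplus H_2} = \langle x_1, y_1 \rangle_{H_1} + \langle x_2, y_2 \rangle_{H_2}.$$

More generally, if $H_i$ is a family of Hilbert spaces indexed by $i \in I$, then the direct sum of the $H_i$, denoted 

$$\bigoplus_{i \in I} H_i,$$

consists of the set of all indexed families 

$$x = (x_i \in H_i \mid i \in I) \in \prod_{i \in I} H_i$$

in the Cartesian product of the $H_i$ such that 

$$\sum_{i \in I} \|x_i\|^2 < \infty.$$

The inner product is defined by 

$$\langle x, y \rangle = \sum_{i \in I} \langle x_i, y_i \rangle_{H_i}.$$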

Each of the $H_i$ is included as a closed subspace in the direct sum of all of the $H_i$. Moreover, the $H_i$ are pairwise 
orthogonal. Conversely, if there is a system of closed subspaces $V_i$, $i \in I$, in a Hilbert space $H$ which are pairwise 
orthogonal and whose linear span is dense in $H$, then $H$ is canonically isomorphic to the direct sum of the $V_i$. In this case, $H$ is 
called the internal direct sum of the $V_i$. A direct sum (internal or external) is also equipped with a family of 
orthogonal projections $E_i$ onto the $i$th direct summand $H_i$. These projections are bounded, self-adjoint, idempotent 
operators which satisfy the orthogonality condition 

$$E_i E_j = 0, \quad i \ne j.$$
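
The direct sum of two finite-dimensional spaces can be modelled by concatenating coordinate vectors. The sketch below assumes NumPy and the real spaces $\mathbb{R}^2$ and $\mathbb{R}^3$ (an illustrative choice) and checks the inner product formula and the orthogonality condition for the two projections.

```python
import numpy as np

rng = np.random.default_rng(7)

# Elements of H_1 = R^2 and H_2 = R^3.
x1, y1 = rng.normal(size=2), rng.normal(size=2)
x2, y2 = rng.normal(size=3), rng.normal(size=3)

# Elements (x1, x2) and (y1, y2) of the direct sum, modelled as vectors in R^5.
x = np.concatenate([x1, x2])
y = np.concatenate([y1, y2])

# Orthogonal projections onto the first and second summand.
E1 = np.diag([1.0, 1.0, 0.0, 0.0, 0.0])
E2 = np.eye(5) - E1
```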

The spectral theorem for compact self-adjoint operators on a Hilbert space H states that H splits into an orthogonal 
direct sum of the eigenspaces of an operator, and also gives an explicit decomposition of the operator as a sum of 
projections onto the eigenspaces. The direct sum of Hilbert spaces also appears in quantum mechanics as the Fock 
space of a system containing a variable number of particles, where each Hilbert space in the direct sum corresponds 
to an additional degree of freedom for the quantum mechanical system. In representation theory, the Peter-Weyl 
theorem guarantees that any unitary representation of a compact group on a Hilbert space splits as the direct sum of 
finite-dimensional representations. 

Tensor products 

If $H_1$ and $H_2$ are Hilbert spaces, then one defines an inner product on the (ordinary) tensor product as follows. On simple tensors, let 

$$\langle x_1 \otimes x_2, \, y_1 \otimes y_2 \rangle = \langle x_1, y_1 \rangle \, \langle x_2, y_2 \rangle.$$

This formula then extends by sesquilinearity to an inner product on $H_1 \otimes H_2$. The Hilbertian tensor product of $H_1$ 
and $H_2$, sometimes denoted by $H_1 \hat{\otimes} H_2$, is the Hilbert space obtained by completing $H_1 \otimes H_2$ for the metric 
associated to this inner product. 

An example is provided by the Hilbert space $L^2([0, 1])$. The Hilbertian tensor product of two copies of $L^2([0, 1])$ is 
isometrically and linearly isomorphic to the space $L^2([0, 1]^2)$ of square-integrable functions on the square $[0, 1]^2$. 
This isomorphism sends a simple tensor $f_1 \otimes f_2$ to the function 

$$(s, t) \mapsto f_1(s)\, f_2(t)$$

on the square. 
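
In finite dimensions a simple tensor can be modelled by the Kronecker product of coordinate vectors. The following sketch, assuming NumPy and the spaces $\mathbb{C}^2$ and $\mathbb{C}^3$ (illustrative choices), checks that the inner product of simple tensors is the product of the factor inner products.

```python
import numpy as np

rng = np.random.default_rng(4)


def inner(u, v):
    """Inner product linear in u and conjugate-linear in v (np.vdot conjugates its first argument)."""
    return np.vdot(v, u)


def random_vector(n):
    return rng.normal(size=n) + 1j * rng.normal(size=n)


x1, y1 = random_vector(2), random_vector(2)  # vectors in H_1 = C^2
x2, y2 = random_vector(3), random_vector(3)  # vectors in H_2 = C^3

# np.kron(x1, x2) holds the coordinates of the simple tensor x1 (x) x2 in C^6.
lhs = inner(np.kron(x1, x2), np.kron(y1, y2))
rhs = inner(x1, y1) * inner(x2, y2)
```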

This example is typical in the following sense.[58] Associated to every simple tensor product $x_1 \otimes x_2$ is the rank 
one operator 

$$x^* \in H_1^* \mapsto x^*(x_1)\, x_2$$

from the (continuous) dual $H_1^*$ to $H_2$. This mapping defined on simple tensors extends to a linear identification 
between $H_1 \otimes H_2$ and the space of finite rank operators from $H_1^*$ to $H_2$. This extends to a linear isometry of the 
Hilbertian tensor product $H_1 \hat{\otimes} H_2$ with the Hilbert space $HS(H_1^*, H_2)$ of Hilbert-Schmidt operators from $H_1^*$ to 
$H_2$. 
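
The identification of a simple tensor with a rank one operator can be sketched with an outer product. The example assumes NumPy and real spaces $H_1 = \mathbb{R}^3$, $H_2 = \mathbb{R}^4$, where a functional $x^*$ is represented by a coefficient vector `a` acting by the dot product; these modelling choices are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(8)

x1 = rng.normal(size=3)  # vector in H_1 = R^3
x2 = rng.normal(size=4)  # vector in H_2 = R^4

# Matrix of the rank one operator x* -> x*(x1) x2 acting on coefficient vectors.
M = np.outer(x2, x1)

a = rng.normal(size=3)  # represents the functional x*(x) = a . x
image = M @ a           # should equal x*(x1) x2

# The Hilbert-Schmidt norm is the Frobenius norm of the matrix.
hs_norm = np.linalg.norm(M, "fro")
```

The Hilbert-Schmidt norm of the operator agrees with the norm $\|x_1\| \, \|x_2\|$ of the simple tensor, which is the isometry property in this special case.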

Orthonormal bases 

The notion of an orthonormal basis from linear algebra carries over to the case of Hilbert spaces. In a Hilbert 
space $H$, an orthonormal basis is a family $\{e_k\}_{k \in B}$ of elements of $H$ satisfying the conditions: 

1. Orthogonality: Every two different elements of the family are orthogonal: $\langle e_k, e_j \rangle = 0$ for all $k, j$ in $B$ with $k \ne j$. 

2. Normalization: Every element of the family has norm 1: $\|e_k\| = 1$ for all $k$ in $B$. 

3. Completeness: The linear span of the family $e_k$, $k \in B$, is dense in $H$. 

A system of vectors satisfying the first two conditions is called an orthonormal system or an orthonormal set 
(or an orthonormal sequence if $B$ is countable). Such a system is always linearly independent. Completeness of an 
orthonormal system of vectors of a Hilbert space can be equivalently restated as: 

if $\langle v, e_k \rangle = 0$ for all $k \in B$ and some $v \in H$ then $v = 0$. 

This is related to the fact that the only vector orthogonal to a dense linear subspace is the zero vector, for if $S$ is an 
orthonormal set whose linear span is dense and $v$ is orthogonal to $S$, then $v$ is orthogonal to the closure of the 
linear span of $S$, which is the whole space. 

Examples of orthonormal bases include: 

• the set $\{(1,0,0), (0,1,0), (0,0,1)\}$ forms an orthonormal basis of $\mathbb{R}^3$ with the dot product; 

• the sequence $\{f_n : n \in \mathbb{Z}\}$ with $f_n(x) = \exp(2\pi i n x)$ forms an orthonormal basis of the complex space $L^2([0,1])$; 

In the infinite-dimensional case, an orthonormal basis will not be a basis in the sense of linear algebra; to distinguish 
the two, the latter basis is also called a Hamel basis. That the span of the basis vectors is dense implies that every 
vector in the space can be written as the sum of an infinite series, and the orthogonality implies that this 
decomposition is unique. 
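
The orthonormality of the exponentials $\exp(2\pi i n x)$ in $L^2([0,1])$ can be checked numerically. The sketch below assumes NumPy and approximates the integral by an equally spaced Riemann sum with 64 points (an assumption of the sketch; for these trigonometric functions and small frequencies the sum is exact up to round-off).

```python
import numpy as np

N = 64                    # number of sample points in [0, 1)
x = np.arange(N) / N
ns = np.arange(-3, 4)     # frequencies n = -3, ..., 3

# Row k holds samples of f_n(x) = exp(2 pi i n x) for n = ns[k].
f = np.exp(2j * np.pi * np.outer(ns, x))

# gram[j, k] approximates the integral of f_{ns[j]} times the conjugate of f_{ns[k]}.
gram = f @ f.conj().T / N
```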

Sequence spaces 

The space $\ell^2$ of square-summable sequences of complex numbers has an orthonormal basis 

$$e_1 = (1, 0, 0, \ldots), \quad e_2 = (0, 1, 0, \ldots), \quad \ldots$$

More generally, if $B$ is any set, then one can form a Hilbert space of sequences with index set $B$, defined by 

$$\ell^2(B) = \Big\{ x : B \to \mathbb{C} \;\Big|\; \sum_{b \in B} |x(b)|^2 < \infty \Big\}.$$

The summation over $B$ is here defined by 

$$\sum_{b \in B} |x(b)|^2 = \sup \sum_{n=1}^{N} |x(b_n)|^2,$$

the supremum being taken over all finite subsets $\{b_1, \ldots, b_N\}$ of $B$. It follows that, in order for this sum to be finite, every element 
of $\ell^2(B)$ has only countably many nonzero terms. This space becomes a Hilbert space with the inner product 

$$\langle x, y \rangle = \sum_{b \in B} x(b)\, \overline{y(b)}$$

for all $x$ and $y$ in $\ell^2(B)$. Here the sum also has only countably many nonzero terms, and is unconditionally convergent 
by the Cauchy-Schwarz inequality. 

An orthonormal basis of $\ell^2(B)$ is indexed by the set $B$, given by 

$$e_b(b') = \begin{cases} 1 & \text{if } b = b' \\ 0 & \text{otherwise.} \end{cases}$$


Bessel's inequality and Parseval's formula 

Let $f_1, \ldots, f_n$ be a finite orthonormal system in $H$. For an arbitrary vector $x$ in $H$, let 

$$y = \sum_{j=1}^{n} \langle x, f_j \rangle \, f_j.$$

Then $\langle x, f_k \rangle = \langle y, f_k \rangle$ for every $k = 1, \ldots, n$. It follows that $x - y$ is orthogonal to each $f_k$, hence $x - y$ is orthogonal to $y$. 
Using the Pythagorean identity twice, it follows that 

$$\|x\|^2 = \|x - y\|^2 + \|y\|^2 \ge \|y\|^2 = \sum_{j=1}^{n} |\langle x, f_j \rangle|^2.$$

Let $\{f_i\}$, $i \in I$, be an arbitrary orthonormal system in $H$. Applying the preceding inequality to every finite subset $J$ of 
$I$ gives the Bessel inequality[60] 

$$\sum_{i \in I} |\langle x, f_i \rangle|^2 \le \|x\|^2, \qquad x \in H$$

(according to the definition of the sum of an arbitrary family of non-negative real numbers). 

Geometrically, Bessel's inequality implies that the orthogonal projection of $x$ onto the linear subspace spanned by the 
$f_i$ has norm that does not exceed that of $x$. In two dimensions, this is the assertion that the length of the leg of a right 
triangle may not exceed the length of the hypotenuse. 
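
The argument behind Bessel's inequality can be replayed numerically. The sketch below assumes NumPy, works in $\mathbb{C}^6$, and obtains an orthonormal system of three vectors from a QR factorization of random data (all illustrative choices).

```python
import numpy as np

rng = np.random.default_rng(3)

# Columns of F form an orthonormal system f_1, f_2, f_3 in C^6.
F, _ = np.linalg.qr(rng.normal(size=(6, 3)) + 1j * rng.normal(size=(6, 3)))

x = rng.normal(size=6) + 1j * rng.normal(size=6)

coeffs = F.conj().T @ x  # coefficients <x, f_j>, linear in x
y = F @ coeffs           # y = sum_j <x, f_j> f_j

bessel_sum = np.sum(np.abs(coeffs) ** 2)
norm_x_sq = np.linalg.norm(x) ** 2
```

Because the system spans only a three-dimensional subspace of $\mathbb{C}^6$, the inequality is strict for a generic $x$.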

Bessel's inequality is a stepping stone to the more powerful Parseval identity which governs the case when Bessel's 
inequality is actually an equality. If $\{e_k\}_{k \in B}$ is an orthonormal basis of $H$, then every element $x$ of $H$ may be written 
as 

$$x = \sum_{k \in B} \langle x, e_k \rangle \, e_k.$$

Even if $B$ is uncountable, Bessel's inequality guarantees that the expression is well-defined and consists only of 
countably many nonzero terms. This sum is called the Fourier expansion of $x$, and the individual coefficients $\langle x, e_k \rangle$ 
are the Fourier coefficients of $x$. Parseval's formula is then 

$$\|x\|^2 = \sum_{k \in B} |\langle x, e_k \rangle|^2.$$

Conversely, if $\{e_k\}$ is an orthonormal set such that Parseval's identity holds for every $x$, then $\{e_k\}$ is an orthonormal 
basis. 
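
A familiar finite-dimensional instance of Parseval's formula is the unitary discrete Fourier transform. The sketch below assumes NumPy and uses `np.fft.fft` with `norm="ortho"`, whose output holds the coefficients of $x$ with respect to the orthonormal basis $e_k[n] = \exp(2\pi i k n / N)/\sqrt{N}$ of $\mathbb{C}^N$; the choice of this basis is an illustration, not taken from the text.

```python
import numpy as np

rng = np.random.default_rng(9)
N = 16
x = rng.normal(size=N) + 1j * rng.normal(size=N)

# Fourier coefficients <x, e_k> with respect to the orthonormal DFT basis.
coeffs = np.fft.fft(x, norm="ortho")

energy_x = np.sum(np.abs(x) ** 2)            # ||x||^2
energy_coeffs = np.sum(np.abs(coeffs) ** 2)  # sum_k |<x, e_k>|^2

# Fourier expansion: x is recovered from its coefficients.
x_rebuilt = np.fft.ifft(coeffs, norm="ortho")
```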



Hilbert dimension 

As a consequence of Zorn's lemma, every Hilbert space admits an orthonormal basis; furthermore, any two 
orthonormal bases of the same space have the same cardinality, called the Hilbert dimension of the space.[61] For 
instance, since $\ell^2(B)$ has an orthonormal basis indexed by $B$, its Hilbert dimension is the cardinality of $B$ (which may 
be a finite integer, or a countable or uncountable cardinal number). 

As a consequence of Parseval's identity, if $\{e_k\}_{k \in B}$ is an orthonormal basis of $H$, then the map $\Phi : H \to \ell^2(B)$ 
defined by $\Phi(x) = (\langle x, e_k \rangle)_{k \in B}$ is an isometric isomorphism of Hilbert spaces: it is a bijective linear mapping such that 

$$\langle \Phi(x), \Phi(y) \rangle_{\ell^2(B)} = \langle x, y \rangle_H$$

for all $x$ and $y$ in $H$. The cardinal number of $B$ is the Hilbert dimension of $H$. Thus every Hilbert space is 
isometrically isomorphic to a sequence space $\ell^2(B)$ for some set $B$. 

Separable spaces 

A Hilbert space is separable if and only if it admits a countable orthonormal basis. All infinite-dimensional separable 
Hilbert spaces are therefore isometrically isomorphic to $\ell^2$. 

In the past, Hilbert spaces were often required to be separable as part of the definition.[62] Most spaces used in 
physics are separable, and since these are all isomorphic to each other, one often refers to any infinite-dimensional 
separable Hilbert space as "the Hilbert space" or just "Hilbert space".[63] Even in quantum field theory, most of the 
Hilbert spaces are in fact separable, as stipulated by the Wightman axioms. However, it is sometimes argued that 
non-separable Hilbert spaces are also important in quantum field theory, roughly because the systems in the theory 
possess an infinite number of degrees of freedom and any infinite Hilbert tensor product (of spaces of dimension 
greater than one) is non-separable.[64] For instance, a bosonic field can be naturally thought of as an element of a 
tensor product whose factors represent harmonic oscillators at each point of space. From this perspective, the natural 
state space of a boson might seem to be a non-separable space.[64] However, it is only a small separable subspace of 
the full tensor product that can contain physically meaningful fields (on which the observables can be defined). 
Another non-separable Hilbert space models the state of an infinite collection of particles in an unbounded region of 
space. An orthonormal basis of the space is indexed by the density of the particles, a continuous parameter, and since 
the set of possible densities is uncountable, the basis is not countable.[64] 






Orthogonal complements and projections 

If $S$ is a subset of a Hilbert space $H$, the set of vectors orthogonal to $S$ is defined by 

$$S^\perp = \{ x \in H : \langle x, s \rangle = 0 \ \ \forall s \in S \}.$$

$S^\perp$ is a closed subspace of $H$ and so forms itself a Hilbert space. If $V$ is a closed subspace of $H$, then $V^\perp$ is called the 
orthogonal complement of $V$. In fact, every $x$ in $H$ can then be written uniquely as $x = v + w$, with $v$ in $V$ and $w$ in 
$V^\perp$. Therefore, $H$ is the internal Hilbert direct sum of $V$ and $V^\perp$. 

The linear operator $P_V : H \to H$ which maps $x$ to $v$ is called the orthogonal projection onto $V$. There is a natural 
one-to-one correspondence between the set of all closed subspaces of $H$ and the set of all bounded self-adjoint 
operators $P$ such that $P^2 = P$. Specifically, 

Theorem. The orthogonal projection $P_V$ is a self-adjoint linear operator on $H$ of norm $\le 1$ with the property 
$P_V^2 = P_V$. Moreover, any self-adjoint linear operator $E$ such that $E^2 = E$ is of the form $P_V$, where $V$ is the range 
of $E$. For every $x$ in $H$, $P_V(x)$ is the unique element $v$ of $V$ which minimizes the distance $\|x - v\|$.[65] 

This provides the geometrical interpretation of $P_V(x)$: it is the best approximation to $x$ by elements of $V$. 
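
The properties in the theorem can be observed on a small example. The sketch below assumes NumPy, takes $H = \mathbb{R}^5$ and a two-dimensional subspace $V$ spanned by random vectors, and builds the projection from an orthonormal basis of $V$; the comparison against random elements of $V$ is only a spot check of the minimizing property, not a proof.

```python
import numpy as np

rng = np.random.default_rng(5)

# V = span of two random vectors in R^5 (closed, since it is finite dimensional).
basis = rng.normal(size=(5, 2))
Q, _ = np.linalg.qr(basis)  # columns of Q: an orthonormal basis of V
P = Q @ Q.T                 # matrix of the orthogonal projection onto V

x = rng.normal(size=5)
v = P @ x                   # candidate best approximation to x in V
best = np.linalg.norm(x - v)

# Distances from x to 1000 other elements of V.
others = basis @ rng.normal(size=(2, 1000))
other_dists = np.linalg.norm(x[:, None] - others, axis=0)
```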

An operator $P$ such that $P = P^2 = P^*$ is called an orthogonal projection. The orthogonal projection $P_V$ onto a closed 
subspace $V$ of $H$ is the adjoint of the inclusion mapping 

$$i_V : V \to H,$$

meaning that 

$$\langle i_V x, y \rangle_H = \langle x, P_V y \rangle_V$$

for all $x \in V$ and $y \in H$. Projections $P_U$ and $P_V$ are called mutually orthogonal if $P_U P_V = 0$. This is equivalent to $U$ 
and $V$ being orthogonal as subspaces of $H$. As a result, the sum of the two projections $P_U$ and $P_V$ is only a projection 
if $U$ and $V$ are orthogonal to each other, and in that case $P_U + P_V = P_{U+V}$. The composite $P_U P_V$ is generally not a 
projection; in fact, the composite is a projection if and only if the two projections commute, and in that case 

$$P_U P_V = P_{U \cap V}.$$

The operator norm of an orthogonal projection $P$ onto a non-zero closed subspace is equal to one: 

$$\|P\| = \sup_{x \in H,\, x \ne 0} \frac{\|Px\|}{\|x\|} = 1.$$

Every closed subspace $V$ of a Hilbert space is therefore the image of an operator $P$ of norm one such that $P^2 = P$. In 
fact this property characterizes Hilbert spaces:[66] 

• A Banach space of dimension higher than 2 is (isometrically) a Hilbert space if and only if, to every closed 
subspace $V$, there is an operator $P_V$ of norm one whose image is $V$ such that $P_V^2 = P_V$. 

While this result characterizes the metric structure of a Hilbert space, the structure of a Hilbert space as a topological 
vector space can itself be characterized in terms of the presence of complementary subspaces:[67] 

• A Banach space $X$ is topologically and linearly isomorphic to a Hilbert space if and only if, to every closed 
subspace $V$, there is a closed subspace $W$ such that $X$ is equal to the internal direct sum $V \oplus W$. 

The orthogonal complement satisfies some more elementary results. It is a monotone function in the sense that if 
$U \subset V$, then $V^\perp \subseteq U^\perp$ with equality holding if and only if $V$ is contained in the closure of $U$. This result is a 
special case of the Hahn-Banach theorem. The closure of a subspace can be completely characterized in terms of the 
orthogonal complement: If $V$ is a subspace of $H$, then the closure of $V$ is equal to $V^{\perp\perp}$. The orthogonal 
complement is thus a Galois connection on the partial order of subspaces of a Hilbert space. In general, the 
orthogonal complement of a sum of subspaces is the intersection of the orthogonal complements:[68] 
$\left(\sum_i V_i\right)^\perp = \bigcap_i V_i^\perp$. If the $V_i$ are in addition closed, then $\overline{\sum_i V_i^\perp} = \left(\bigcap_i V_i\right)^\perp$. 






Spectral theory 

There is a well-developed spectral theory for self-adjoint operators in a Hilbert space, that is roughly analogous to 
the study of symmetric matrices over the reals or self-adjoint matrices over the complex numbers.[69] In the same 
sense, one can obtain a "diagonalization" of a self-adjoint operator as a suitable sum (actually an integral) of 
orthogonal projection operators. 

The spectrum of an operator $T$, denoted $\sigma(T)$, is the set of complex numbers $\lambda$ such that $T - \lambda$ lacks a continuous 
inverse. If $T$ is bounded, then the spectrum is always a compact set in the complex plane, and lies inside the disc 
$|z| \le \|T\|$. If $T$ is self-adjoint, then the spectrum is real. In fact, it is contained in the interval $[m, M]$ where 

$$m = \inf_{\|x\|=1} \langle Tx, x \rangle, \qquad M = \sup_{\|x\|=1} \langle Tx, x \rangle.$$

Moreover, $m$ and $M$ are both actually contained within the spectrum. 

The eigenspaces of an operator $T$ are given by 

$$H_\lambda = \ker(T - \lambda).$$

Unlike with finite matrices, not every element of the spectrum of $T$ must be an eigenvalue: the linear operator $T - \lambda$ 
may only lack an inverse because it is not surjective. Elements of the spectrum of an operator in the general sense are 
known as spectral values. Since spectral values need not be eigenvalues, the spectral decomposition is often more 
subtle than in finite dimensions. 

However, the spectral theorem of a self-adjoint operator $T$ takes a particularly simple form if, in addition, $T$ is 
assumed to be a compact operator. The spectral theorem for compact self-adjoint operators states:[70] 

• A compact self-adjoint operator $T$ has only countably (or finitely) many spectral values. The spectrum of $T$ has no 
limit point in the complex plane except possibly zero. The eigenspaces of $T$ decompose $H$ into an orthogonal 
direct sum: 

$$H = \bigoplus_{\lambda \in \sigma(T)} H_\lambda.$$

Moreover, if $E_\lambda$ denotes the orthogonal projection onto the eigenspace $H_\lambda$, then 

$$T = \sum_{\lambda \in \sigma(T)} \lambda E_\lambda,$$

where the sum converges with respect to the norm on $B(H)$. 

This theorem plays a fundamental role in the theory of integral equations, as many integral operators are compact, in 
particular those that arise from Hilbert-Schmidt operators. 
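
Every operator on a finite-dimensional space is compact, so the decomposition can be illustrated with a symmetric matrix. The sketch below assumes NumPy and a random real symmetric $4 \times 4$ matrix (illustrative data); it rebuilds the operator from its eigenprojections and spot-checks that the quadratic form $\langle Tx, x \rangle$ on unit vectors stays between the smallest and largest eigenvalue.

```python
import numpy as np

rng = np.random.default_rng(6)
M0 = rng.normal(size=(4, 4))
T = (M0 + M0.T) / 2  # real symmetric, hence self-adjoint

# Eigenvalues in ascending order and an orthonormal basis of eigenvectors.
lam, U = np.linalg.eigh(T)

# Rank one orthogonal projections onto the eigenvector directions.
E = [np.outer(U[:, k], U[:, k]) for k in range(4)]

T_rebuilt = sum(l * Ek for l, Ek in zip(lam, E))  # T = sum of lambda * E_lambda
identity_rebuilt = sum(E)                         # projections sum to the identity

# Values <Tx, x> for 1000 random unit vectors x.
xs = rng.normal(size=(4, 1000))
xs /= np.linalg.norm(xs, axis=0)
rayleigh = np.einsum("ik,ij,jk->k", xs, T, xs)
```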

The general spectral theorem for self-adjoint operators involves a kind of operator-valued Riemann-Stieltjes 
integral, rather than an infinite summation.[71] The spectral family associated to $T$ associates to each real number $\lambda$ an 
operator $E_\lambda$, which is the projection onto the nullspace of the operator $(T - \lambda)^+$, where the positive part of a 
self-adjoint operator is defined by 

$$A^+ = \tfrac{1}{2}\left(\sqrt{A^2} + A\right).$$

The operators $E_\lambda$ are monotone increasing relative to the partial order defined on self-adjoint operators; the 
eigenvalues correspond precisely to the jump discontinuities. One has the spectral theorem, which asserts 

$$T = \int_{\mathbb{R}} \lambda \, dE_\lambda.$$

The integral is understood as a Riemann-Stieltjes integral, convergent with respect to the norm on $B(H)$. In 
particular, one has the ordinary scalar-valued integral representation 

$$\langle Tx, y \rangle = \int_{\mathbb{R}} \lambda \, d\langle E_\lambda x, y \rangle.$$





A somewhat similar spectral decomposition holds for normal operators, although because the spectrum may now 
contain non-real complex numbers, the operator-valued Stieltjes measure $dE_\lambda$ must instead be replaced by a 
resolution of the identity. 

A major application of spectral methods is the spectral mapping theorem, which allows one to apply to a self-adjoint 
operator $T$ any continuous complex function $f$ defined on the spectrum of $T$ by forming the integral 

$$f(T) = \int_{\sigma(T)} f(\lambda) \, dE_\lambda.$$

The resulting continuous functional calculus has applications in particular to pseudodifferential operators.[72] 
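
For a self-adjoint matrix the integral reduces to a finite sum over eigenvalues, which suggests the following sketch. It assumes NumPy; the helper name `apply_function` is hypothetical and introduced only for this illustration.

```python
import numpy as np


def apply_function(T, f):
    """Return f(T) for a Hermitian matrix T by applying f to its eigenvalues."""
    lam, U = np.linalg.eigh(T)           # T = U diag(lam) U*
    return (U * f(lam)) @ U.conj().T     # f(T) = U diag(f(lam)) U*


rng = np.random.default_rng(10)
M0 = rng.normal(size=(4, 4))
T = (M0 + M0.T) / 2  # self-adjoint test matrix

T_squared = apply_function(T, lambda l: l ** 2)    # should equal T @ T
W = apply_function(T, lambda l: np.exp(1j * l))    # exp(iT), expected to be unitary
```

Applying $f(\lambda) = \lambda^2$ reproduces the matrix product, and $f(\lambda) = e^{i\lambda}$ yields a unitary matrix because $|f| = 1$ on the real spectrum.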

The spectral theory of unbounded self-adjoint operators is only marginally more difficult than for bounded operators. 
The spectrum of an unbounded operator is defined in precisely the same way as for bounded operators: $\lambda$ is a spectral 
value if the resolvent operator 

$$R_\lambda = (T - \lambda)^{-1}$$

fails to be a well-defined continuous operator. The self-adjointness of $T$ still guarantees that the spectrum is real. 
Thus the essential idea of working with unbounded operators is to look instead at the resolvent $R_\lambda$ where $\lambda$ is 
non-real. This is a bounded normal operator, which admits a spectral representation that can then be transferred to a 
spectral representation of $T$ itself. A similar strategy is used, for instance, to study the spectrum of the Laplace 
operator: rather than address the operator directly, one instead looks at an associated resolvent such as a Riesz 
potential or Bessel potential. 
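
The boundedness and normality of the resolvent at a non-real point can be seen already for matrices. The sketch below assumes NumPy, a random real symmetric matrix as a stand-in for $T$, and the arbitrary non-real point $z = 0.3 + 0.7i$; a matrix is of course bounded, so this only illustrates the algebra, not the unbounded case itself.

```python
import numpy as np

rng = np.random.default_rng(11)
M0 = rng.normal(size=(4, 4))
T = (M0 + M0.T) / 2  # self-adjoint, so its spectrum is real

z = 0.3 + 0.7j       # a non-real point, hence outside the spectrum
R = np.linalg.inv(T - z * np.eye(4))  # resolvent (T - z)^{-1}

eigs = np.linalg.eigvalsh(T)
resolvent_norm = np.linalg.norm(R, 2)
# For a self-adjoint T the resolvent norm is 1 / distance(z, spectrum).
expected_norm = 1.0 / np.min(np.abs(eigs - z))
```

In particular the norm is at most $1/|\operatorname{Im} z|$, uniformly in $T$, which is the bound that survives for unbounded self-adjoint operators.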

A precise version of the spectral theorem which holds in this case is:[73] 

Given a densely-defined self-adjoint operator $T$ on a Hilbert space $H$, there corresponds a unique resolution of 
the identity $E$ on the Borel sets of $\mathbb{R}$, such that 

$$\langle Tx, y \rangle = \int_{\mathbb{R}} \lambda \, dE_{x,y}(\lambda)$$

for all $x \in D(T)$ and $y \in H$. The spectral measure $E$ is concentrated on the spectrum of $T$. 

There is also a version of the spectral theorem that applies to unbounded normal operators. 



Notes 

[1] Marsden 1974, §2.8 

[2] The mathematical material in this section can be found in any good textbook on functional analysis, such as Dieudonne (1960), Hewitt & 
Stromberg (1965), Reed & Simon (1980) or Rudin (1980). 
[3] In some conventions, inner products are linear in their second arguments instead. 
[4] Dieudonne 1960, §6.2 
[5] Dieudonne 1960 

[6] Largely from the work of Hermann Grassmann, at the urging of August Ferdinand Mobius (Boyer & Merzbach 1991, pp. 584-586). The first 
modern axiomatic account of abstract vector spaces ultimately appeared in Giuseppe Peano's 1888 account (Grattan-Guinness 2000, §5.2.2; 
O'Connor & Robertson 1996). 

[7] A detailed account of the history of Hilbert spaces can be found in Bourbaki 1987. 

[8] Schmidt 1908 

[9] Titchmarsh 1946, §IX.1 

[10] Lebesgue 1904. Further details on the history of integration theory can be found in Bourbaki (1987) and Saks (2005). 

[11] Bourbaki 1987. 

[12] Dunford & Schwartz 1958, §IV.16 

[13] In Dunford & Schwartz (1958, §IV.16), the result that every linear functional on L 2 [0,1] is represented by integration is jointly attributed to 
Frechet (1907) and Riesz (1907). The general result, that the dual of a Hilbert space is identified with the Hilbert space itself, can be found in 
Riesz (1934). 

[14] von Neumann 1929. 

[15] Kline 1972, p. 1092 

[16] Hilbert, Nordheim & von Neumann 1927. 

[17] Weyl 1931. 



[18] Prugovecki 1981, pp. 1-10. 

[19] von Neumann 1932 

[20] Halmos 1957, Section 42. 

[21] Hewitt & Stromberg 1965. 

[22] Bers, John & Schechter 1981. 

[23] Giusti 2003. 

[24] Stein 1970 

[25] Details can be found in Warner (1983). 

[26] A general reference on Hardy spaces is the book Duren (1970). 

[27] Krantz 2002, §1.4 

[28] Krantz 2002, §1.5 

[29] Young 1987, Chapter 9. 

[30] The eigenvalues of the Fredholm kernel are $1/\lambda$, which tend to zero. 

[31] More detail on finite element methods from this point of view can be found in Brenner & Scott (2005). 

[32] Reed & Simon 1980 

[33] A treatment of Fourier series from this point of view is available, for instance, in Rudin (1987) or Folland (2009). 

[34] Halmos 1957, §5 

[35] Bachman, Narici & Beckenstein 2000 

[36] Stein & Weiss 1971, §IV.2. 

[37] Lanczos 1988, pp. 212-213 

[38] Lanczos 1988, Equation 4-3.10 

[39] The classic reference for spectral methods is Courant & Hilbert 1953. A more up-to-date account is Reed & Simon 1975. 

[40] Kac 1966 

[41] Dirac 1930 

[42] von Neumann 1955 

[43] Young 1988, p. 23. 

[44] Clarkson 1936. 

[45] Rudin 1987, Theorem 4.10 

[46] Dunford & Schwartz 1958, II.4.29 

[47] Rudin 1987, Theorem 4.11 

[48] Weidmann 1980, Theorem 4.8 

[49] Weidmann 1980, §4.5 

[50] Buttazzo, Giaquinta & Hildebrandt 1998, Theorem 5.17 

[51] Halmos 1982, Problem 52, 58 

[52] Rudin 1973 

[53] Treves 1967, Chapter 18 

[54] See Prugovecki (1981), Reed & Simon (1980, Chapter VIII) and Folland (1989). 

[55] Prugovecki 1981, III, §1.4 

[56] Dunford & Schwartz 1958, IV.4.17-18 

[57] Weidmann 1980, §3.4 

[58] Kadison & Ringrose 1983, Theorem 2.6.4 

[59] Dunford & Schwartz 1958, §IV.4. 

[60] For the case of finite index sets, see, for instance, Halmos 1957, §5. For infinite index sets, see Weidmann 1980, Theorem 3.6. 

[61] Levitan 2001. Many authors, such as Dunford & Schwartz (1958, §IV.4), refer to this just as the dimension. Unless the Hilbert space is finite 
dimensional, this is not the same thing as its dimension as a linear space (the cardinality of a Hamel basis). 

[62] Prugovecki 1981, I, §4.2 

[63] von Neumann (1955) defines a Hilbert space via a countable Hilbert basis, which amounts to an isometric isomorphism with $\ell^2$. The 
convention still persists in most rigorous treatments of quantum mechanics; see for instance Sobrino 1996, Appendix B. 

[64] Streater & Wightman 1964, pp. 86-87 

[65] Young 1988, Theorem 15.3 

[66] Kakutani 1939 

[67] Lindenstrauss & Tzafriri 1971 

[68] Halmos 1957, §12 

[69] A general account of spectral theory in Hilbert spaces can be found in Riesz & Sz.-Nagy (1990). A more sophisticated account in the 
language of C*-algebras is in Rudin (1973) or Kadison & Ringrose (1997). 

[70] See, for instance, Riesz & Sz.-Nagy (1990, Chapter VI) or Weidmann 1980, Chapter 7. This result was already known to Schmidt (1907) in 
the case of operators arising from integral kernels. 

[71] Riesz & Sz.-Nagy 1990, §§107-108 

[72] Shubin 1987 



[73] Rudin 1973, Theorem 13.30. 

References 

• Bachman, George; Narici, Lawrence; Beckenstein, Edward (2000), Fourier and wavelet analysis, Universitext, 
Berlin, New York: Springer-Verlag, MR1729490, ISBN 978-0-387-98899-3. 

• Bers, Lipman; John, Fritz; Schechter, Martin (1981), Partial differential equations, American Mathematical 
Society, ISBN 0821800493. 

• Bourbaki, Nicolas (1986), Spectral theories, Elements of mathematics, Berlin: Springer-Verlag, ISBN 
0201007673. 

• Bourbaki, Nicolas (1987), Topological vector spaces, Elements of mathematics, Berlin: Springer-Verlag, 
ISBN 978-3540136279. 

• Boyer, Carl Benjamin; Merzbach, Uta C (1991), A History of Mathematics (2nd ed.), John Wiley & Sons, Inc., 
ISBN 0-471-54397-7. 

• Brenner, S.; Scott, R. L. (2005), The Mathematical Theory of Finite Element Methods (2nd ed.), Springer, 
ISBN 0-3879-5451-1. 

• Buttazzo, Giuseppe; Giaquinta, Mariano; Hildebrandt, Stefan (1998), One-dimensional variational problems, 
Oxford Lecture Series in Mathematics and its Applications, 15, The Clarendon Press Oxford University Press, 
MR1694383, ISBN 978-0-19-850465-8. 

• Clarkson, J. A. (1936), "Uniformly convex spaces" (http://www.jstor.org/stable/1989630), Trans. Amer. Math. 
Soc. 40 (3): 396-414, doi: 10.2307/1989630. 

• Courant, Richard; Hilbert, David (1953), Methods of Mathematical Physics, Vol. I, Interscience. 

• Dieudonne, Jean (1960), Foundations of Modern Analysis, Academic Press. 

• Dirac, P.A.M. (1930), The Principles of Quantum Mechanics, Oxford: Clarendon Press. 

• Dunford, N.; Schwartz, J.T. (1958), Linear operators, Parts I and II, Wiley-Interscience. 

• Duren, P. (1970), Theory of H p -Spaces, New York: Academic Press. 

• Folland, Gerald B. (2009), Fourier analysis and its application (http://books.google.com/books?as_isbn=0821847902) 
(Reprint of Wadsworth and Brooks/Cole 1992 ed.), American Mathematical Society 
Bookstore, ISBN 0821847902. 

• Folland, Gerald B. (1989), Harmonic analysis in phase space, Annals of Mathematics Studies, 122, Princeton 
University Press, ISBN 0-691-08527-7. 

• Frechet, Maurice (1907), "Sur les ensembles de fonctions et les operations lineaires", C. R. Acad. Sci. Paris 144: 
1414-1416. 

• Frechet, Maurice (1904-1907), Sur les operations lineaires. 

• Giusti, Enrico (2003), Direct Methods in the Calculus of Variations, World Scientific, ISBN 981-238-043-4. 

• Grattan-Guinness, Ivor (2000), The search for mathematical roots, 1870-1940, Princeton Paperbacks, Princeton 
University Press, MR1807717, ISBN 978-0-691-05858-0. 

• Halmos, Paul (1957), Introduction to Hilbert Space and the Theory of Spectral Multiplicity, Chelsea Pub. Co 

• Halmos, Paul (1982), A Hilbert Space Problem Book, Springer-Verlag, ISBN 0387906851. 

• Hewitt, Edwin; Stromberg, Karl (1965), Real and Abstract Analysis, Springer-Verlag. 

• Hilbert, David; Nordheim, Lothar (Wolfgang); von Neumann, John (1927), "Uber die Grundlagen der 
Quantenmechanik" (http://dz-srvl.sub.uni-goettingen.de/sub/digbib/loader?ht=VIEW&did=D27779), 
Mathematische Annalen 98: 1-30, doi:10.1007/BF01451579. 

• Kac, Mark (1966), "Can one hear the shape of a drum?" (http://jstor.org/stable/2313748), American 
Mathematical Monthly 73 (4, part 2): 1-23, doi:10.2307/2313748. 

• Kadison, Richard V.; Ringrose, John R. (1997), Fundamentals of the theory of operator algebras. Vol. I, Graduate 
Studies in Mathematics, 15, Providence, R.I.: American Mathematical Society, MR1468229, 
ISBN 978-0-8218-0819-1. 

• Kakutani, Shizuo (1939), "Some characterizations of Euclidean space", Jap. J. Math. 16: 93-97, MR0000895. 

• Kline, Morris (1972), Mathematical thought from ancient to modern times, Volume 3 (3rd ed.), Oxford University 
Press (published 1990), ISBN 978-0195061376. 

• Kolmogorov, Andrey; Fomin, Sergei V. (1970), Introductory Real Analysis (Revised English edition, trans, by 
Richard A. Silverman (1975) ed.), Dover Press, ISBN 0-486-61226-0. 

• Krantz, Steven G. (2002), Function Theory of Several Complex Variables, Providence, R.I.: American 
Mathematical Society, ISBN 978-0-8218-2724-6. 

• Lanczos, Cornelius (1988), Applied analysis (http://books.google.com/books?as_isbn=048665656X) (Reprint 
of 1956 Prentice-Hall ed.), Dover Publications, ISBN 048665656X. 

• Lindenstrauss, J.; Tzafriri, L. (1971), "On the complemented subspaces problem", Israel Journal of Mathematics 
9: 263-269, doi:10.1007/BF02771592, MR0276734, ISSN 0021-2172. 

• O'Connor, John J.; Robertson, Edmund F. (1996), "Abstract linear spaces" (http://www-history.mcs.st-andrews.ac.uk/HistTopics/Abstract_linear_spaces.html), 
MacTutor History of Mathematics archive, University of St Andrews. 

• Lebesgue, Henri (1904), Leçons sur l'intégration et la recherche des fonctions primitives (http://books.google.com/?id=VfUKAAAAYAAJ&dq="Lebesgue" "Leçons sur l'intégration et la recherche des fonctions ..."&pg=PA1#v=onepage&q=), 
Gauthier-Villars. 

• B.M. Levitan (2001), "Hilbert space" (http://eom.springer.de/H/h047380.htm), in Hazewinkel, Michiel, 
Encyclopaedia of Mathematics, Springer, ISBN 978-1556080104. 

• Marsden, Jerrold E. (1974), Elementary classical analysis, W. H. Freeman and Co., MR0357693. 

• Odell, E.; Schlumprecht, Th. (1993), "The distortion problem of Hilbert space", Geom. Funct. Anal. 3: 201-207, 
doi:10.1007/BF01896023, MR1209302, ISSN 1016-443X. 

• Odell, E.; Schlumprecht, Th. (1994), "The distortion problem", Acta Mathematica 173: 259-281, 
doi:10.1007/BF02398436, MR1301394, ISSN 0001-5962. 

• Prugovecki, Eduard (1981), Quantum mechanics in Hilbert space (2nd ed.), Dover (published 2006), 
ISBN 978-0486453279. 

• Reed, Michael; Simon, Barry (1980), Functional Analysis, Methods of Modern Mathematical Physics, Academic 
Press, ISBN 0-12-585050-6. 

• Reed, Michael; Simon, Barry (1975), Fourier Analysis, Self-Adjointness, Methods of Modern Mathematical 
Physics, Academic Press, ISBN 0-12-585002-6. 

• Riesz, Frigyes (1907), "Sur une espèce de géométrie analytique des systèmes de fonctions sommables", C. R. 
Acad. Sci. Paris 144: 1409-1411. 

• Riesz, Frigyes (1934), "Zur Theorie des Hilbertschen Raumes", Acta Sci. Math. Szeged 7: 34-38. 

• Riesz, Frigyes; Sz.-Nagy, Béla (1990), Functional analysis, Dover, ISBN 0-486-66289-6. 

• Rudin, Walter (1973), Functional analysis, Tata McGraw-Hill. 

• Rudin, Walter (1987), Real and Complex Analysis, McGraw-Hill, ISBN 0-07-100276-6. 

• Saks, Stanislaw (2005), Theory of the integral (2nd Dover ed.), Dover, ISBN 978-0486446486; originally 
published Monografje Matematyczne, vol. 7, Warszawa, 1937. 

• Schmidt, Erhard (1908), "Über die Auflösung linearer Gleichungen mit unendlich vielen Unbekannten", Rend. 
Circ. Mat. Palermo 25: 63-77, doi:10.1007/BF03029116. 

• Shubin, M. A. (1987), Pseudodifferential operators and spectral theory, Springer Series in Soviet Mathematics, 
Berlin, New York: Springer-Verlag, MR883081, ISBN 978-3-540-13621-7. 

• Sobrino, Luis (1996), Elements of non-relativistic quantum mechanics, River Edge, NJ: World Scientific 
Publishing Co. Inc., MR1626401, ISBN 9789810223861. 

• Stewart, James (2006), Calculus: Concepts and Contexts (3rd ed.), Thomson/Brooks/Cole. 






• Stein, E (1970), Singular Integrals and Differentiability Properties of Functions, Princeton Univ. Press, 
ISBN 0-691-08079-8. 

• Stein, Elias; Weiss, Guido (1971), Introduction to Fourier Analysis on Euclidean Spaces, Princeton, N.J.: 
Princeton University Press, ISBN 978-0-691-08078-9. 

• Streater, Ray; Wightman, Arthur (1964), PCT, Spin and Statistics and All That, W. A. Benjamin, Inc. 

• Titchmarsh, Edward Charles (1946), Eigenfunction expansions, part 1, Oxford University: Clarendon Press. 

• Trèves, François (1967), Topological Vector Spaces, Distributions and Kernels, Academic Press. 

• von Neumann, John (1929), "Allgemeine Eigenwerttheorie Hermitescher Funktionaloperatoren", Mathematische 
Annalen 102: 49-131, doi:10.1007/BF01782338. 

• von Neumann, John (1932), "Physical Applications of the Ergodic Hypothesis" (http://www.jstor.org/stable/86260), 
Proc Natl Acad Sci USA 18 (3): 263-266, doi:10.1073/pnas.18.3.263, PMID 16587674, PMC 1076204. 

• von Neumann, John (1955), Mathematical foundations of quantum mechanics, Princeton Landmarks in 
Mathematics, Princeton University Press (published 1996), MR1435976, ISBN 978-0-691-02893-4. 

• Warner, Frank (1983), Foundations of Differentiable Manifolds and Lie Groups, Berlin, New York: 
Springer-Verlag, ISBN 978-0-387-90894-6. 

• Weidmann, Joachim (1980), Linear operators in Hilbert spaces, Graduate Texts in Mathematics, 68, Berlin, New 
York: Springer-Verlag, MR566954, ISBN 978-0-387-90427-6. 

• Weyl, Hermann (1931), The Theory of Groups and Quantum Mechanics (English 1950 ed.), Dover Press, 
ISBN 0-486-60269-9. 

• Young, N (1988), An introduction to Hilbert space, Cambridge University Press, ISBN 0-521-33071-8. 

External links 

• Hilbert Space at Mathworld (http://mathworld.wolfram.com/HilbertSpace.html) 

• 245B, notes 5: Hilbert spaces (http://terrytao.wordpress.com/2009/01/17/254a-notes-5-hilbert-spaces/) by 
Terence Tao 






Von Neumann algebra 

In mathematics, a von Neumann algebra or W*-algebra is a *-algebra of bounded operators on a Hilbert space that 
is closed in the weak operator topology and contains the identity operator. They were originally introduced by John 
von Neumann, motivated by the study of single operators, group representations, ergodic theory and quantum 
mechanics. His double commutant theorem shows that the analytic definition is equivalent to a purely algebraic 
definition as an algebra of symmetries. 

Two basic examples of von Neumann algebras are as follows. The ring L^∞(R) of essentially bounded measurable 
functions on the real line is a commutative von Neumann algebra, which acts by pointwise multiplication on the 
Hilbert space L^2(R) of square integrable functions. The algebra B(H) of all bounded operators on a Hilbert space H is 
a von Neumann algebra, non-commutative if the Hilbert space has dimension at least 2. 

Von Neumann algebras were first studied by von Neumann (1929); he and Francis Murray developed the basic 
theory, under the original name of rings of operators, in a series of papers written in the 1930s and 1940s (F.J. 
Murray & J. von Neumann 1936, 1937, 1943; J. von Neumann 1938, 1940, 1943, 1949), reprinted in the collected 
works of von Neumann (1961). 

Introductory accounts of von Neumann algebras are given in the online notes of Jones (2003) and Wassermann 
(1991) and the books by Dixmier (1981), Schwartz (1967), Blackadar (2005) and Sakai (1971). The three volume 
work by Takesaki (1979) gives an encyclopedic account of the theory. The book by Connes (1994) discusses more 
advanced topics. 

Definitions 

There are three common ways to define von Neumann algebras. 

The first and most common way is to define them as weakly closed *-algebras of bounded operators (on a Hilbert 
space) containing the identity. In this definition the weak (operator) topology can be replaced by many other 
common topologies including the strong, ultrastrong or ultraweak operator topologies. The *-algebras of bounded 
operators that are closed in the norm topology are C*-algebras, so in particular any von Neumann algebra is a 
C*-algebra. 

The second definition is that a von Neumann algebra is a subset of the bounded operators closed under * and equal to 
its double commutant, or equivalently the commutant of some subset closed under *. The von Neumann double 
commutant theorem (von Neumann 1929) says that the first two definitions are equivalent. 
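
In finite dimensions the commutant in the second definition can be computed by plain linear algebra. The following Python sketch is only an illustration under stated assumptions: the helper names rank and commutant_dimension are hypothetical, the matrices are small exact examples in the 2 x 2 matrix algebra, and only dimensions are compared. It solves AX = XA as a linear system in the entries of X.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a matrix (list of rows) by exact Gaussian elimination."""
    rows = [[Fraction(x) for x in r] for r in rows]
    r = 0
    ncols = len(rows[0]) if rows else 0
    for c in range(ncols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][c] != 0:
                f = rows[i][c] / rows[r][c]
                rows[i] = [x - f * y for x, y in zip(rows[i], rows[r])]
        r += 1
        if r == len(rows):
            break
    return r

def commutant_dimension(mats, n):
    """Dimension of {X in M_n : AX = XA for every A in mats}."""
    rows = []
    for a in mats:
        for i in range(n):
            for j in range(n):
                # Entry (i, j) of AX - XA as a linear form in the entries of X,
                # where the entry x_pq of X is stored at index p * n + q.
                row = [Fraction(0)] * (n * n)
                for k in range(n):
                    row[k * n + j] += Fraction(a[i][k])  # (AX)_ij = sum_k a_ik x_kj
                    row[i * n + k] -= Fraction(a[k][j])  # (XA)_ij = sum_k x_ik a_kj
                rows.append(row)
    return n * n - rank(rows)

diag = [[1, 0], [0, 2]]
identity = [[1, 0], [0, 1]]
matrix_units = [[[1, 0], [0, 0]], [[0, 1], [0, 0]],
                [[0, 0], [1, 0]], [[0, 0], [0, 1]]]

print(commutant_dimension([diag], 2))          # diagonal matrices: dimension 2
print(commutant_dimension(matrix_units, 2))    # scalars only: dimension 1
print(commutant_dimension([identity], 2))      # all 2 x 2 matrices: dimension 4
```

The last two lines illustrate that the commutant of all 2 x 2 matrices is the scalars and the commutant of the scalars is everything, so the full matrix algebra equals its own double commutant. The diagonal matrices commute exactly with the diagonal matrices, so they too form an algebra equal to its double commutant.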

The first two definitions describe a von Neumann algebra concretely as a set of operators acting on some given 
Hilbert space. Sakai (1971) showed that von Neumann algebras can also be defined abstractly as C*-algebras that 
have a predual; in other words the von Neumann algebra, considered as a Banach space, is the dual of some other 
Banach space called the predual. The predual of a von Neumann algebra is in fact unique up to isomorphism. Some 
authors use "von Neumann algebra" for the algebras together with a Hilbert space action, and "W*-algebra" for the 
abstract concept, so a von Neumann algebra is a W*-algebra together with a Hilbert space and a suitable faithful 
unital action on the Hilbert space. The concrete and abstract definitions of a von Neumann algebra are similar to the 
concrete and abstract definitions of a C*-algebra, which can be defined either as norm-closed *-algebras of operators 
on a Hilbert space, or as Banach *-algebras such that ||a a*|| = ||a|| ||a*||. 






Terminology 

Some of the terminology in von Neumann algebra theory can be confusing, and the terms often have different 
meanings outside the subject. 

• A factor is a von Neumann algebra with trivial center, i.e. a center consisting only of scalar operators. 

• A finite von Neumann algebra is one which is the direct integral of finite factors. Similarly, properly infinite von 
Neumann algebras are the direct integral of properly infinite factors. 

• A von Neumann algebra that acts on a separable Hilbert space is called separable. Note that such algebras are 
rarely separable in the norm topology. 

• The von Neumann algebra generated by a set of bounded operators on a Hilbert space is the smallest von 
Neumann algebra containing all those operators. 

• The tensor product of two von Neumann algebras acting on two Hilbert spaces is defined to be the von 
Neumann algebra generated by their algebraic tensor product, considered as operators on the Hilbert space tensor 
product of the Hilbert spaces. 

By forgetting about the topology on a von Neumann algebra, we can consider it a (unital) *-algebra, or just a ring. 
Von Neumann algebras are semihereditary: every finitely generated submodule of a projective module is itself 
projective. There have been several attempts to axiomatize the underlying rings of von Neumann algebras, including 
Baer *-rings and AW* algebras. The *-algebra of affiliated operators of a finite von Neumann algebra is a von 
Neumann regular ring. (The von Neumann algebra itself is in general not von Neumann regular.) 

Commutative von Neumann algebras 

Main article: Abelian von Neumann algebra 

The relationship between commutative von Neumann algebras and measure spaces is analogous to that between 
commutative C*-algebras and locally compact Hausdorff spaces. Every commutative von Neumann algebra is 
isomorphic to L^∞(X) for some measure space (X, μ) and conversely, for every σ-finite measure space X, the *-algebra 
L^∞(X) is a von Neumann algebra. 

Due to this analogy, the theory of von Neumann algebras has been called noncommutative measure theory, while the 
theory of C*-algebras is sometimes called noncommutative topology (Connes 1994). 

Projections 

Operators E in a von Neumann algebra for which E = EE = E* are called projections; they are exactly the operators 
which give an orthogonal projection of H onto some closed subspace. A subspace of the Hilbert space H is said to 
belong to the von Neumann algebra M if it is the image of some projection in M. Informally these are the closed 
subspaces that can be described using elements of M, or that M "knows" about. The closure of the image of any 
operator in M and the kernel of any operator in M both belong to M, and the closure of the image of any subspace 
belonging to M under an operator of M also belongs to M. There is a 1:1 correspondence between projections of M 
and subspaces that belong to it. 
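
The defining identities E = EE = E* can be checked directly on a small example. The sketch below is an illustration only: it works in the algebra of all 2 x 2 matrices with real rational entries (so the adjoint is just the transpose), and the helper name projection_onto is hypothetical.

```python
from fractions import Fraction

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def transpose(a):
    # For real matrices the adjoint E* is the transpose.
    return [list(row) for row in zip(*a)]

def projection_onto(v):
    """Orthogonal projection onto the line spanned by a non-zero real vector v."""
    norm_sq = sum(Fraction(x) * x for x in v)
    return [[Fraction(x) * y / norm_sq for y in v] for x in v]

E = projection_onto([1, 2])

print(matmul(E, E) == E)    # idempotent: E = EE
print(transpose(E) == E)    # self-adjoint: E = E*
```

Here the subspace "belonging to" the algebra is the line spanned by (1, 2), which is the image of E.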

The basic theory of projections was worked out by Murray & von Neumann (1936). Two subspaces belonging to M 
are called (Murray-von Neumann) equivalent if there is a partial isometry mapping the first isomorphically onto 
the other that is an element of the von Neumann algebra (informally, if M "knows" that the subspaces are 
isomorphic). This induces a natural equivalence relation on projections by defining E to be equivalent to F if the 
corresponding subspaces are equivalent, or in other words if there is a partial isometry of H that maps the image of E 
isometrically to the image of F and is an element of the von Neumann algebra. Another way of stating this is that E 
is equivalent to F if E = uu* and F = u*u for some partial isometry u in M. 
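
As a minimal sketch of this last formulation, assume M is the algebra of all 2 x 2 matrices and take u to be a matrix unit. The names below are illustrative only. The code checks that u is a partial isometry and that the two coordinate projections are equivalent, with E = uu* and F = u*u.

```python
def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def adjoint(a):
    """Conjugate transpose of a square or rectangular matrix."""
    return [[a[i][j].conjugate() for i in range(len(a))] for j in range(len(a[0]))]

# u maps the second basis vector to the first and kills the first.
u = [[0, 1],
     [0, 0]]
u_star = adjoint(u)

E = matmul(u, u_star)   # projection onto the first coordinate axis
F = matmul(u_star, u)   # projection onto the second coordinate axis

print(E)
print(F)
print(matmul(E, u) == u)   # u u* u = u, so u is a partial isometry
```

Both projections have rank one, which matches the fact that isometries preserve dimension.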

The equivalence relation ~ thus defined is additive in the following sense: Suppose E_1 ~ F_1 and E_2 ~ F_2. If E_1 ⊥ E_2 
and F_1 ⊥ F_2 then E_1 + E_2 ~ F_1 + F_2. This is not true in general if one requires unitary equivalence in the definition of 
~, i.e. if we say E is equivalent to F if u*Eu = F for some unitary u. 
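
The additivity of ~ can be verified by a short computation. The following is a sketch under the convention that equivalence of E_i and F_i is implemented by a partial isometry u_i in M with E_i = u_i u_i^* and F_i = u_i^* u_i; the symbols u_1, u_2 are introduced here for illustration.

```latex
% Assumptions: E_i = u_i u_i^*, F_i = u_i^* u_i (i = 1, 2), E_1 E_2 = 0, F_1 F_2 = 0.
\begin{align*}
u_i &= u_i u_i^* u_i = E_i u_i = u_i F_i, \\
u_1 u_2^* &= (u_1 F_1)(F_2 u_2^*) = u_1 (F_1 F_2) u_2^* = 0, \\
u_1^* u_2 &= (u_1^* E_1)(E_2 u_2) = u_1^* (E_1 E_2) u_2 = 0, \\
(u_1 + u_2)(u_1 + u_2)^* &= u_1 u_1^* + u_2 u_2^* = E_1 + E_2, \\
(u_1 + u_2)^* (u_1 + u_2) &= u_1^* u_1 + u_2^* u_2 = F_1 + F_2 .
\end{align*}
```

So u_1 + u_2 is a partial isometry in M implementing the equivalence of the two sums.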

The subspaces belonging to M are partially ordered by inclusion, and this induces a partial order ≤ of projections. 
There is also a natural partial order on the set of equivalence classes of projections, induced by the partial order ≤ of 
projections. If M is a factor, ≤ is a total order on equivalence classes of projections, described in the section on traces 
below. 

A projection (or subspace belonging to M) E is said to be finite if there is no projection F < E that is equivalent to E. 
For example, all finite-dimensional projections (or subspaces) are finite (since isometries between Hilbert spaces 
leave the dimension fixed), but the identity operator on an infinite-dimensional Hilbert space is not finite in the von 
Neumann algebra of all bounded operators on it, since it is isometrically isomorphic to a proper subset of itself. 
However it is possible for infinite dimensional subspaces to be finite. 

Orthogonal projections are noncommutative analogues of indicator functions in L^∞(R). L^∞(R) is the ||·||_∞-closure of 
the subspace generated by the indicator functions. Similarly, a von Neumann algebra is generated by its projections; 
this is a consequence of the spectral theorem for self-adjoint operators. 

Factors 

A von Neumann algebra N whose center consists only of multiples of the identity operator is called a factor. Von 
Neumann (1949) showed that every von Neumann algebra on a separable Hilbert space is isomorphic to a direct 
integral of factors. This decomposition is essentially unique. Thus, the problem of classifying isomorphism classes of 
von Neumann algebras on separable Hilbert spaces can be reduced to that of classifying isomorphism classes of 
factors. 

Murray & von Neumann (1936) showed that every factor has one of 3 types as described below. The type 
classification can be extended to von Neumann algebras that are not factors, and a von Neumann algebra is of type X 
if it can be decomposed as a direct integral of type X factors; for example, every commutative von Neumann algebra 
has type I_1. Every von Neumann algebra can be written uniquely as a sum of von Neumann algebras of types I, II, 
and III. 

There are several other ways to divide factors into classes that are sometimes used: 

• A factor is called discrete (or occasionally tame) if it has type I, and continuous (or occasionally wild) if it has 
type II or III. 

• A factor is called semifinite if it has type I or II, and purely infinite if it has type III. 

• A factor is called finite if the projection 1 is finite and properly infinite otherwise. Factors of types I and II may 
be either finite or properly infinite, but factors of type III are always properly infinite. 

Type I factors 

A factor is said to be of type I if there is a minimal projection E ≠ 0, i.e. a projection E such that there is no other 
projection F with 0 < F < E. Any factor of type I is isomorphic to the von Neumann algebra of all bounded operators 
on some Hilbert space; since there is one Hilbert space for every cardinal number, isomorphism classes of factors of 
type I correspond exactly to the cardinal numbers. Since many authors consider von Neumann algebras only on 
separable Hilbert spaces, it is customary to call the bounded operators on a Hilbert space of finite dimension n a 
factor of type I_n, and the bounded operators on a separable infinite-dimensional Hilbert space, a factor of type I_∞. 






Type II factors 

A factor is said to be of type II if there are no minimal projections but there are non-zero finite projections. This 
implies that every projection E can be halved in the sense that there are equivalent projections F and G such that E = 
F + G. If the identity operator in a type II factor is finite, the factor is said to be of type II_1; otherwise, it is said to be 
of type II_∞. The best understood factors of type II are the hyperfinite type II_1 factor and the hyperfinite type II_∞ 
factor, found by Murray & von Neumann (1936). These are the unique hyperfinite factors of types II_1 and II_∞; there 
are an uncountable number of other factors of these types that are the subject of intensive study. Murray & von 
Neumann (1937) proved the fundamental result that a factor of type II_1 has a unique finite tracial state, and the set of 
traces of projections is [0,1]. 

A factor of type II_∞ has a semifinite trace, unique up to rescaling, and the set of traces of projections is [0,∞]. The set 
of real numbers λ such that there is an automorphism rescaling the trace by a factor of λ is called the fundamental 
group of the type II_∞ factor. 

The tensor product of a factor of type II_1 and an infinite type I factor has type II_∞, and conversely any factor of type 
II_∞ can be constructed like this. The fundamental group of a type II_1 factor is defined to be the fundamental group 
of its tensor product with the infinite (separable) factor of type I. For many years it was an open problem to find a 
type II factor whose fundamental group was not the group of all positive reals, but Connes then showed that the von 
Neumann group algebra of a countable discrete group with Kazhdan's property T (the trivial representation is 
isolated in the dual space), such as SL_3(Z), has a countable fundamental group. Subsequently Sorin Popa showed 
that the fundamental group can be trivial for certain groups, including the semidirect product of Z^2 by SL_2(Z). 

An example of a type II_1 factor is the von Neumann group algebra of a countable infinite discrete group such that 
every non-trivial conjugacy class is infinite. McDuff (1969) found an uncountable family of such groups with 
non-isomorphic von Neumann group algebras, thus showing the existence of uncountably many different separable 
type II_1 factors. 

Type III factors 

Lastly, type III factors are factors that do not contain any nonzero finite projections at all. In their first paper Murray 
& von Neumann (1936) were unable to decide whether or not they existed; the first examples were later found by von 
Neumann (1940). Since the identity operator is always infinite in those factors, they were sometimes called type III_∞ 
in the past, but recently that notation has been superseded by the notation III_λ, where λ is a real number in the 
interval [0,1]. More precisely, if the Connes spectrum (of its modular group) is 1 then the factor is of type III_0, if the 
Connes spectrum is all integral powers of λ for 0 < λ < 1, then the type is III_λ, and if the Connes spectrum is all 
positive reals then the type is III_1. (The Connes spectrum is a closed subgroup of the positive reals, so these are the 
only possibilities.) The only trace on type III factors takes value ∞ on all non-zero positive elements, and any two 
non-zero projections are equivalent. At one time type III factors were considered to be intractable objects, but 
Tomita-Takesaki theory has led to a good structure theory. In particular, any type III factor can be written in a 
canonical way as the crossed product of a type II_∞ factor and the real numbers. 






The predual 

Any von Neumann algebra M has a predual M_*, which is the Banach space of all ultraweakly continuous linear 
functionals on M. As the name suggests, M is (as a Banach space) the dual of its predual. The predual is unique in the 
sense that any other Banach space whose dual is M is canonically isomorphic to M_*. Sakai (1971) showed that the 
existence of a predual characterizes von Neumann algebras among C*-algebras. 

The definition of the predual given above seems to depend on the choice of Hilbert space that M acts on, as this 
determines the ultraweak topology. However the predual can also be defined without using the Hilbert space that M 
acts on, by defining it to be the space generated by all positive normal linear functionals on M. (Here "normal" 
means that it preserves suprema when applied to increasing nets of self adjoint operators; or equivalently to 
increasing sequences of projections.) 

The predual is a closed subspace of the dual M* (which consists of all norm-continuous linear functionals on M) 
but is generally smaller. The proof that M_* is (usually) not the same as M* is nonconstructive and uses the axiom of 
choice in an essential way; it is very hard to exhibit explicit elements of M* that are not in M_*. For example, exotic 
positive linear forms on the von Neumann algebra ℓ^∞(Z) are given by free ultrafilters; they correspond to exotic 
*-homomorphisms into C and describe the Stone-Čech compactification of Z. 

Examples: 

1. The predual of the von Neumann algebra L^∞(R) of essentially bounded functions on R is the Banach space L^1(R) 
of integrable functions. The dual of L^∞(R) is strictly larger than L^1(R). For example, a functional on L^∞(R) that 
extends the Dirac measure δ_0 on the closed subspace of bounded continuous functions C^0_b(R) cannot be 
represented as a function in L^1(R). 

2. The predual of the von Neumann algebra B(H) of bounded operators on a Hilbert space H is the Banach space of 
all trace class operators with the trace norm ||A|| = Tr(|A|). The Banach space of trace class operators is itself the 
dual of the C*-algebra of compact operators (which is not a von Neumann algebra). 

Weights, states, and traces 

Weights and their special cases states and traces are discussed in detail in (Takesaki 1979). 

• A weight ω on a von Neumann algebra is a linear map from the set of positive elements (those of the form a*a) to 
[0,∞]. 

• A positive linear functional is a weight with ω(1) finite (or rather the extension of ω to the whole algebra by 
linearity). 

• A state is a weight with ω(1) = 1. 

• A trace is a weight with ω(aa*) = ω(a*a) for all a. 

• A tracial state is a trace with ω(1) = 1. 
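
For the algebra of all n x n matrices the normalized matrix trace is a tracial state in the sense of the list above. The sketch below checks the two defining identities on one example. It assumes real integer entries, so that the adjoint is the transpose, and the function name normalized_trace is illustrative.

```python
from fractions import Fraction

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def transpose(a):
    # Adjoint of a real matrix.
    return [list(row) for row in zip(*a)]

def normalized_trace(a):
    """The functional a -> Tr(a)/n on n x n matrices, computed exactly."""
    n = len(a)
    return sum(Fraction(a[i][i]) for i in range(n)) / n

identity = [[1, 0], [0, 1]]
a = [[1, 2], [3, 4]]
a_star = transpose(a)

print(normalized_trace(identity))               # the state condition: value 1
print(normalized_trace(matmul(a, a_star)))      # value on a a*
print(normalized_trace(matmul(a_star, a)))      # value on a* a, the same number
```

The unnormalized trace Tr is a trace but not a state when n > 1, since it takes the value n on the identity.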

Any factor has a trace such that the trace of a non-zero projection is non-zero and the trace of a projection is infinite 
if and only if the projection is infinite. Such a trace is unique up to rescaling. For factors that are separable or finite, 
two projections are equivalent if and only if they have the same trace. The type of a factor can be read off from the 
possible values of this trace as follows: 

• Type I_n: 0, x, 2x, ..., nx for some positive x (usually normalized to be 1/n or 1). 

• Type I_∞: 0, x, 2x, ..., ∞ for some positive x (usually normalized to be 1). 

• Type II_1: [0,x] for some positive x (usually normalized to be 1). 

• Type II_∞: [0,∞]. 

• Type III: 0, ∞. 

If a von Neumann algebra acts on a Hilbert space containing a norm 1 vector v, then the functional a → (av,v) is a 
normal state. This construction can be reversed to give an action on a Hilbert space from a normal state: this is the 
GNS construction for normal states. 
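
A vector state is easy to evaluate in finite dimensions. The following sketch assumes a real Hilbert space of dimension 2 with rational coordinates, so that all arithmetic is exact; the name vector_state is hypothetical. It checks that the functional takes the value 1 on the identity and equals the squared norm of av on a*a, which is non-negative.

```python
from fractions import Fraction

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def transpose(a):
    return [list(row) for row in zip(*a)]

def vector_state(v):
    """Return the functional a -> (av, v) for a real vector v."""
    def omega(a):
        av = [sum(Fraction(a[i][j]) * v[j] for j in range(len(v))) for i in range(len(v))]
        return sum(x * y for x, y in zip(av, v))
    return omega

v = [Fraction(3, 5), Fraction(4, 5)]     # a vector of norm 1
omega = vector_state(v)

identity = [[1, 0], [0, 1]]
a = [[1, 2], [3, 4]]
a_star_a = matmul(transpose(a), a)

print(omega(identity))    # 1, so omega is a state
print(omega(a_star_a))    # the squared norm of av, hence non-negative
```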

Modules over a factor 

Given an abstract separable factor, one can ask for a classification of its modules, meaning the separable Hilbert 
spaces that it acts on. The answer is given as follows: every such module H can be given an M-dimension dim_M(H) 
(not its dimension as a complex vector space) such that modules are isomorphic if and only if they have the same 
M-dimension. The M-dimension is additive, and a module is isomorphic to a subspace of another module if and only 
if it has smaller or equal M-dimension. 

A module is called standard if it has a cyclic separating vector. Each factor has a standard representation, which is 
unique up to isomorphism. The standard representation has an antilinear involution J such that JMJ = M'. For finite 
factors the standard module is given by the GNS construction applied to the unique normal tracial state and the 
M-dimension is normalized so that the standard module has M-dimension 1, while for infinite factors the standard 
module is the module with M-dimension equal to ∞. 

The possible M-dimensions of modules are given as follows: 

• Type I_n (n finite): The M-dimension can be any of 0/n, 1/n, 2/n, 3/n, ..., ∞. The standard module has M-dimension 
1 (and complex dimension n^2). 

• Type I_∞: The M-dimension can be any of 0, 1, 2, 3, ..., ∞. The standard representation of B(H) is H⊗H; its 
M-dimension is ∞. 

• Type II_1: The M-dimension can be anything in [0, ∞]. It is normalized so that the standard module has 
M-dimension 1. The M-dimension is also called the coupling constant of the module H. 

• Type II_∞: The M-dimension can be anything in [0, ∞]. There is in general no canonical way to normalize it; the 
factor may have outer automorphisms multiplying the M-dimension by constants. The standard representation is 
the one with M-dimension ∞. 

• Type III: The M-dimension can be 0 or ∞. Any two non-zero modules are isomorphic, and all non-zero modules 
are standard. 

Amenable von Neumann algebras 

Connes (1976) and others proved that the following conditions on a von Neumann algebra M on a separable Hilbert 
space H are all equivalent: 

• M is hyperfinite or AFD or approximately finite dimensional or approximately finite: this means the algebra 
contains an ascending sequence of finite dimensional subalgebras with dense union. (Warning: some authors use 
"hyperfinite" to mean "AFD and finite".) 

• M is amenable: this means that the derivations of M with values in a normal dual Banach bimodule are all inner. 

• M has Schwartz's property P: for any bounded operator T on H the weak operator closed convex hull of the 
elements uTu* contains an element commuting with M. 

• M is semidiscrete: this means the identity map from M to M is a weak pointwise limit of completely positive 
maps of finite rank. 

• M has property E or the Hakeda-Tomiyama extension property: this means that there is a projection of norm 1 
from bounded operators on H to M'. 

• M is injective: any completely positive linear map from any self adjoint closed subspace containing 1 of any 
unital C*-algebra A to M can be extended to a completely positive map from A to M. 

There is no generally accepted term for the class of algebras above; Connes has suggested that amenable should be 
the standard term. 






The amenable factors have been classified: there is a unique one of each of the types I_n, I_∞, II_1, II_∞, III_λ, for 0 < λ ≤ 1, 
and the ones of type III_0 correspond to certain ergodic flows. (For type III_0 calling this a classification is a little 
misleading, as it is known that there is no easy way to classify the corresponding ergodic flows.) The ones of type I 
and II_1 were classified by Murray & von Neumann (1943), and the remaining ones were classified by Connes (1976), 
except for the type III_1 case which was completed by Haagerup. 

All amenable factors can be constructed using the group-measure space construction of Murray and von Neumann 
for a single ergodic transformation. In fact they are precisely the factors arising as crossed products by free ergodic 
actions of Z or Z/nZ on abelian von Neumann algebras L^∞(X). Type I factors occur when the measure space X is atomic 
and the action transitive. When X is diffuse or non-atomic, it is equivalent to [0,1] as a measure space. Type II 
factors occur when X admits an equivalent finite (II_1) or infinite (II_∞) measure, invariant under Z. Type III factors 
occur in the remaining cases where there is no invariant measure, but only an invariant measure class: these factors 
are called Krieger factors. 



Tensor products of von Neumann algebras 

The Hilbert space tensor product of two Hilbert spaces is the completion of their algebraic tensor product. One can 
define a tensor product of von Neumann algebras (a completion of the algebraic tensor product of the algebras 
considered as rings), which is again a von Neumann algebra, and acts on the tensor product of the corresponding 
Hilbert spaces. The tensor product of two finite algebras is finite, and the tensor product of an infinite algebra and a 
non-zero algebra is infinite. The type of the tensor product of two von Neumann algebras (I, II, or III) is the 
maximum of their types. The commutation theorem for tensor products states that 

(M ⊗ N)' = M' ⊗ N' 
(where M' denotes the commutant of M). 
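
In finite dimensions the tensor product of operators is the Kronecker product of matrices, and one inclusion of the commutation theorem can be seen concretely. The sketch below is an illustration only, with a hypothetical helper kron. It takes M to be all 2 x 2 matrices and N the scalars, so that M' is the scalars and N' is all 2 x 2 matrices, and checks that an element a ⊗ 1 of M ⊗ N commutes with an element 1 ⊗ b of M' ⊗ N'. It does not verify the reverse inclusion.

```python
def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def kron(a, b):
    """Kronecker product: the matrix of the operator a ⊗ b."""
    return [[a[i][j] * b[k][l]
             for j in range(len(a[0])) for l in range(len(b[0]))]
            for i in range(len(a)) for k in range(len(b))]

identity = [[1, 0], [0, 1]]
a = [[1, 2], [3, 4]]
b = [[0, 1], [5, 6]]

left = kron(a, identity)     # a ⊗ 1, an element of M ⊗ N
right = kron(identity, b)    # 1 ⊗ b, an element of M' ⊗ N'

print(matmul(left, right) == matmul(right, left))   # the two operators commute
print(matmul(left, right) == kron(a, b))            # both products equal a ⊗ b
```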

The tensor product of an infinite number of von Neumann algebras, if done naively, is usually a ridiculously large 
non-separable algebra. Instead von Neumann (1938) showed that one should choose a state on each of the von 
Neumann algebras, use this to define a state on the algebraic tensor product, which can be used to produce a Hilbert 
space and a (reasonably small) von Neumann algebra. Araki & Woods (1968) studied the case where all the factors 
are finite matrix algebras; these factors are called Araki-Woods factors or ITPFI factors (ITPFI stands for "infinite 
tensor product of finite type I factors"). The type of the infinite tensor product can vary dramatically as the states are 
changed; for example, the infinite tensor product of an infinite number of type I_2 factors can have any type 
depending on the choice of states. In particular Powers (1967) found an uncountable family of non-isomorphic 
hyperfinite type III_λ factors for 0 < λ < 1, called Powers factors, by taking an infinite tensor product of type I_2 factors, 
each with the state given by x ↦ Tr(diag(1/(λ+1), λ/(λ+1)) x), where diag(p, q) denotes the 2×2 diagonal matrix 
with diagonal entries p and q. 
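
The state used on each 2 x 2 matrix factor in the Powers construction can be written out explicitly. The sketch below assumes the density matrix is the diagonal matrix with entries 1/(λ+1) and λ/(λ+1), and the name powers_state is hypothetical. It checks that the functional is normalized and that the ratio of its values on the two diagonal matrix units is λ.

```python
from fractions import Fraction

def powers_state(lam):
    """Return x -> Tr(rho x) with rho = diag(1/(1+lam), lam/(1+lam)), assuming 0 < lam < 1."""
    lam = Fraction(lam)
    rho = [[1 / (1 + lam), 0],
           [0, lam / (1 + lam)]]
    def phi(x):
        # Tr(rho x) = sum over i, k of rho[i][k] * x[k][i]
        return sum(rho[i][k] * x[k][i] for i in range(2) for k in range(2))
    return phi

phi = powers_state(Fraction(1, 2))

identity = [[1, 0], [0, 1]]
e11 = [[1, 0], [0, 0]]
e22 = [[0, 0], [0, 1]]

print(phi(identity))          # 1, so phi is a state
print(phi(e22) / phi(e11))    # the parameter lam
```

For λ = 1 the same formula would give the normalized trace; the values 0 < λ < 1 give the non-tracial states used for the Powers factors.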

All hyperfinite von Neumann algebras not of type III_0 are isomorphic to Araki-Woods factors, but there are 
uncountably many of type III 0 that are not. 

Bimodules and subfactors 

A bimodule (or correspondence) is a Hilbert space H with module actions of two commuting von Neumann 
algebras. Bimodules have a much richer structure than that of modules. Any bimodule over two factors always gives 
a subfactor since one of the factors is always contained in the commutant of the other. There is also a subtle relative 
tensor product operation due to Connes on bimodules. The theory of subfactors, initiated by Vaughan Jones, 
reconciles these two seemingly different points of view. 

Bimodules are also important for the von Neumann group algebra M of a discrete group Γ. Indeed if V is any 
unitary representation of Γ, then, regarding Γ as the diagonal subgroup of Γ × Γ, the corresponding induced 
representation on ℓ^2(Γ, V) is naturally a bimodule for two commuting copies of M. Important representation 
theoretic properties of Γ can be formulated entirely in terms of bimodules and therefore make sense for the von 
Neumann algebra itself. For example Connes and Jones gave a definition of an analogue of Kazhdan's Property T for 
von Neumann algebras in this way. 

Non-amenable factors 

Von Neumann algebras of type I are always amenable, but for the other types there are an uncountable number of 
different non-amenable factors, which seem very hard to classify, or even distinguish from each other. Nevertheless 
Voiculescu has shown that the class of non-amenable factors coming from the group-measure space construction is 
disjoint from the class coming from group von Neumann algebras of free groups. Later Narutaka Ozawa proved that 
group von Neumann algebras of hyperbolic groups yield prime type II_1 factors, i.e. ones that cannot be factored as 
tensor products of type II_1 factors, a result first proved by Liming Ge for free group factors using Voiculescu's free 
entropy. Popa's work on fundamental groups of non-amenable factors represents another significant advance. The 
theory of factors "beyond the hyperfinite" is rapidly expanding at present, with many new and surprising results; it 
has close links with rigidity phenomena in geometric group theory and ergodic theory. 

Examples 

• The essentially bounded functions on a σ-finite measure space form a commutative (type I_1) von Neumann 
algebra acting on the L^2 functions. For certain non-σ-finite measure spaces, usually considered pathological, 
L^∞(X) is not a von Neumann algebra; for example, the σ-algebra of measurable sets might be the 
countable-cocountable algebra on an uncountable set. 

• The bounded operators on any Hilbert space form a von Neumann algebra, indeed a factor, of type I. 

• If we have any unitary representation of a group G on a Hilbert space H then the bounded operators commuting 
with G form a von Neumann algebra G', whose projections correspond exactly to the closed subspaces of H 
invariant under G. Equivalent subrepresentations correspond to equivalent projections in G'. The double 
commutant G" of G is also a von Neumann algebra. 

• The von Neumann group algebra of a discrete group G is the algebra of all bounded operators on H = l^2(G) 
commuting with the action of G on H through right multiplication. One can show that this is the von Neumann 
algebra generated by the operators corresponding to multiplication from the left with an element g ∈ G. It is a 
factor (of type II_1) if every non-trivial conjugacy class of G is infinite (for example, a non-abelian free group), 
and is the hyperfinite factor of type II_1 if in addition G is a union of finite subgroups (for example, the group of all 
permutations of the integers fixing all but a finite number of elements). 

• The tensor product of two von Neumann algebras, or of a countable number with states, is a von Neumann 
algebra as described in the section above. 

• The crossed product of a von Neumann algebra by a discrete (or more generally locally compact) group can be 
defined, and is a von Neumann algebra. Special cases are the group-measure space construction of Murray and 
von Neumann and Krieger factors. 

• The von Neumann algebras of a measurable equivalence relation and a measurable groupoid can be defined. 
These examples generalise von Neumann group algebras and the group-measure space construction. 
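
As a finite-dimensional illustration of the group algebra example in the list above, the following Python sketch builds the left and right translation operators on l^2(G) for the finite group G = S_3. It checks that they commute, and that the space of operators commuting with all right translations has dimension |G|, which is consistent with that space being spanned by the left translations. The choice of S_3, the helper names and the use of NumPy are assumptions made for illustration only; a finite group never yields a type II_1 factor, so the sketch illustrates the commutation statement and nothing more.

```python
import itertools
import numpy as np

# Elements of the symmetric group S_3, written as permutation tuples.
G = list(itertools.permutations(range(3)))
index = {g: i for i, g in enumerate(G)}
n = len(G)

def compose(g, h):
    """Return the permutation g∘h (apply h first, then g)."""
    return tuple(g[h[i]] for i in range(3))

def left(g):
    """Matrix of the left translation delta_h -> delta_{gh} on l^2(G)."""
    m = np.zeros((n, n))
    for h in G:
        m[index[compose(g, h)], index[h]] = 1.0
    return m

def right(g):
    """Matrix of the right translation delta_h -> delta_{hg} on l^2(G)."""
    m = np.zeros((n, n))
    for h in G:
        m[index[compose(h, g)], index[h]] = 1.0
    return m

# Every left translation commutes with every right translation.
commute = all(
    np.allclose(left(g) @ right(k), right(k) @ left(g))
    for g in G for k in G
)

# Dimension of the commutant {X : R_k X = X R_k for all k}, computed as the
# null space of the stacked linear maps X -> R_k X - X R_k (row-major vec).
eye = np.eye(n)
system = np.vstack([
    np.kron(right(k), eye) - np.kron(eye, right(k).T) for k in G
])
commutant_dim = n * n - np.linalg.matrix_rank(system)

print(commute, commutant_dim)
```

The commutant is computed by brute-force linear algebra rather than by representation theory, so the same code can be tried on any other small finite group by changing the list G and the function compose.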



Applications 

Von Neumann algebras have found applications in diverse areas of mathematics such as knot theory, statistical 
mechanics, quantum field theory, local quantum physics, free probability, noncommutative geometry, 
representation theory, geometry, and probability. 

References 

• Araki, H.; Woods, E. J. (1968), "A classification of factors", Publ. Res. Inst. Math. Sci. Ser. A 4: 51-130, 
doi:10.2977/prims/1195195263, MR0244773 

• Blackadar, B. (2005), Operator algebras, Springer, ISBN 3-540-28486-9 

• Connes, A. (1976), "Classification of Injective Factors" [1], The Annals of Mathematics 2nd Ser. 104 (1): 73-115, 
doi:10.2307/1971057 

• Connes, A. (1994), Non-commutative geometry [2] , Academic Press, ISBN 0-12-185860-X. 

• Dixmier, J. (1981), Von Neumann algebras, ISBN 0-444-86308-7 (A translation of Dixmier, J. (1957), Les 
algèbres d'opérateurs dans l'espace hilbertien: algèbres de von Neumann, Gauthier-Villars, the first book about 
von Neumann algebras.) 

• Jones, V.F.R. (2003), von Neumann algebras [3]; incomplete notes from a course. 

• McDuff, Dusa (1969), "Uncountably many II_1 factors" [4], Ann. of Math. (Annals of Mathematics) 90 (2): 
372-377, doi:10.2307/1970730 

• Murray, F. J., "The rings of operators papers", The legacy of John von Neumann (Hempstead, NY, 1988), Proc. 
Sympos. Pure Math., 50, Providence, RI: Amer. Math. Soc., pp. 57-60, ISBN 0-8218-4219-6. A historical 
account of the discovery of von Neumann algebras. 

• Murray, F.J.; von Neumann, J. (1936), "On rings of operators" [5], Ann. of Math. (2) (Annals of Mathematics) 37 
(1): 116-229, doi:10.2307/1968693. This paper gives their basic properties and the division into types I, II, and 
III, and in particular finds factors not of type I. 

• Murray, F.J.; von Neumann, J. (1937), "On rings of operators II" [6], Trans. Amer. Math. Soc. (American 
Mathematical Society) 41 (2): 208-248, doi:10.2307/1989620. This is a continuation of the previous paper, that 
studies properties of the trace of a factor. 

• Murray, F.J.; von Neumann, J. (1943), "On rings of operators IV" [7], Ann. of Math. (2) (Annals of Mathematics) 
44 (4): 716-808, doi:10.2307/1969107. This studies when factors are isomorphic, and in particular shows that all 
approximately finite factors of type II_1 are isomorphic. 

• Powers, Robert T. (1967), "Representations of Uniformly Hyperfinite Algebras and Their Associated von 
Neumann Rings" [8] , The Annals of Mathematics, 2nd Ser. 86 (1): 138-171, doi: 10.2307/1970364 

• Sakai, S. (1971), C*-algebras and W*-algebras, Springer, ISBN 3-540-63633-1 

• Schwartz, Jacob (1967), W-* Algebras, ISBN 0-677-00670-5 

• Shtern, A.I. (2001), "von Neumann algebra" [9], in Hazewinkel, Michiel, Encyclopaedia of Mathematics, 
Springer, ISBN 978-1556080104 

• Takesaki, M. (1979), Theory of Operator Algebras I, II, III, ISBN 3-540-42248-X ISBN 3-540-42914-X ISBN 
3-540-42913-1 

• von Neumann, J. (1929), "Zur Algebra der Funktionaloperationen und Theorie der normalen Operatoren", Math. 
Ann. 102: 370-427, doi:10.1007/BF01782352. The original paper on von Neumann algebras. 

• von Neumann, J. (1936), "On a Certain Topology for Rings of Operators" [10], The Annals of Mathematics 2nd 
Ser. 37 (1): 111-115, doi:10.2307/1968692. This defines the ultrastrong topology. 

• von Neumann, J. (1938), "On infinite direct products" [11], Compos. Math. 6: 1-77. This discusses infinite tensor 
products of Hilbert spaces and the algebras acting on them. 

• von Neumann, J. (1940), "On rings of operators III" [12] , Ann. Of Math. (2) (Annals of Mathematics) 41 (1): 
94-161, doi: 10.2307/1968823. This shows the existence of factors of type III. 



• von Neumann, J. (1943), "On Some Algebraical Properties of Operator Rings" [13], The Annals of Mathematics 
2nd Ser. 44 (4): 709-715, doi:10.2307/1969106. This shows that some apparently topological properties in von 
Neumann algebras can be defined purely algebraically. 

• von Neumann, J. (1949), "On Rings of Operators. Reduction Theory" [14], The Annals of Mathematics 2nd Ser. 50 
(2): 401-485, doi:10.2307/1969463. This discusses how to write a von Neumann algebra as a sum or integral of 
factors. 

• von Neumann, John (1961), Taub, A.H., ed., Collected Works, Volume III: Rings of Operators, NY: Pergamon 
Press. Reprints von Neumann's papers on von Neumann algebras. 

• Wassermann, A. J. (1991), Operators on Hilbert space [15] 



References 

[1] http://links.jstor.org/sici?sici=0003-486X%28197607%292%3A104%3A1%3C73%3ACOIFC%3E2.0.CO%^ 
[2] http://www.alainconnes.org/docs/book94bigpdf.pdf 
[3] http://www.math.berkeley.edu/~vfr/MATH20909/VonNeumann2009.pdf 
[4] http://links.jstor.org/sici?sici=0003-486X%28196909%292%3A90%3A2%3C372%3AUMIF%3E2.0.CO%3B2-C 
[5] http://links.jstor.org/sici?sici=0003-486X%28193601%292%3A37%3A1%3C116%3AOROO%3E2.0.CO%3B2-Y 
[6] http://links.jstor.org/sici?sici=0002-9947%28193703%2941%3A2%3C208%3AOROOI%3E2.0.CO%3B2-9 
[7] http://links.jstor.org/sici?sici=0003-486X%28194310%292%3A44%3A4%3C716%3AOROOI%3E2.0.CO%3B2-O 
[8] http://links.jstor.org/sici?sici=0003-486X%28196707%292%3A86%3A1%3C138%3AROUHAA%3E2.0.CO%3B2-6 
[9] http://eom.springer.de/V/v096900.htm 
[10] http://links.jstor.org/sici?sici=0003-486X%28193601%292%3A37%3A1%3C111%3AOACTFR%3E2.0.TO 
[11] http://www.numdam.org/item?id=CM_1939__6__1_0 
[12] http://links.jstor.org/sici?sici=0003-486X%28194001%292%3A41%3A1%3C94%3AOROOI%3E2.0.CO%3B2-U 
[13] http://links.jstor.org/sici?sici=0003-486X%28194310%292%3A44%3A4%3C709%3AOSAPOO%3E2.0.CO%3B2-L 
[14] http://links.jstor.org/sici?sici=0003-486X%28194904%292%3A50%3A2%3C401%3AOROORT%3E2.0.CO%3B2-H 
[15] http://iml.univ-mrs.fr/~wasserm/OHS.ps 




C*-algebra 

C*-algebras (pronounced "C-star") are an important area of research in functional analysis, a branch of 
mathematics. The prototypical example of a C*-algebra is a complex algebra A of linear operators on a complex 
Hilbert space with two additional properties: 

• A is a topologically closed set in the norm topology of operators. 

• A is closed under the operation of taking adjoints of operators. 

It is generally believed that C*-algebras were first considered primarily for their use in quantum mechanics to model 
algebras of physical observables. This line of research began with Werner Heisenberg's matrix mechanics and in a 
more mathematically developed form with Pascual Jordan around 1933. Subsequently John von Neumann attempted 
to establish a general framework for these algebras which culminated in a series of papers on rings of operators. 
These papers considered a special class of C*-algebras which are now known as von Neumann algebras. 

Around 1943, the work of Israel Gelfand and Mark Naimark yielded an abstract characterisation of C*-algebras 
making no reference to operators. 

C*-algebras are now an important tool in the theory of unitary representations of locally compact groups, and are 
also used in algebraic formulations of quantum mechanics. 

Abstract characterization 

We begin with the abstract characterization of C*-algebras given in the 1943 paper by Gelfand and Naimark. 

A C*-algebra, A, is a Banach algebra over the field of complex numbers, together with a map, * : A → A, called an 
involution. The image of an element x of A under the involution is written x*. Involution has the following 
properties: 

• For all x, y in A: 

$(x + y)^* = x^* + y^*$ 

$(xy)^* = y^* x^*$ 

• For every λ in C and every x in A: 

$(\lambda x)^* = \bar{\lambda} x^*$. 

• For all x in A: 

$(x^*)^* = x$ 

• The C*-identity holds for all x in A: 

$\|x^* x\| = \|x\| \|x^*\|$. 

Note that the C*-identity is equivalent to: for all x in A: 

$\|x x^*\| = \|x\| \|x^*\|$. 

This relation is equivalent to $\|x x^*\| = \|x\|^2$, which is sometimes called the B*-identity. For history behind the 
names C*- and B*-algebras, see the history section below. 

The C*-identity is a very strong requirement. For instance, together with the spectral radius formula, it implies the 
C*-norm is uniquely determined by the algebraic structure: 

$\|x\|^2 = \|x^* x\| = \sup\{|\lambda| : x^* x - \lambda 1 \text{ is not invertible}\}$. 
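
The C*-identity and the spectral formula for the norm can be checked numerically in the matrix algebra M_n(C) with the operator norm. The Python sketch below is an illustration added under stated assumptions: the random 4-by-4 matrix, the seed and the helper name op_norm are arbitrary choices, and NumPy is used only for convenience. It also shows that the Frobenius norm, although submultiplicative, does not satisfy the C*-identity.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
x_star = x.conj().T  # the involution on M_n(C) is the conjugate transpose

def op_norm(a):
    """Operator norm on C^n, i.e. the largest singular value."""
    return np.linalg.norm(a, 2)

# C*-identity in the form ||x* x|| = ||x||^2.
lhs = op_norm(x_star @ x)
rhs = op_norm(x) ** 2

# The same number recovered from the spectrum of x* x.
spectral_radius = max(abs(np.linalg.eigvals(x_star @ x)))

# For the Frobenius norm, ||x||^2 - ||x* x|| is strictly positive here,
# so that norm is not a C*-norm.
frob_gap = np.linalg.norm(x, "fro") ** 2 - np.linalg.norm(x_star @ x, "fro")

print(lhs, rhs, spectral_radius, frob_gap)
```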
A bounded linear map, π : A → B, between C*-algebras A and B is called a *-homomorphism if 

• For x and y in A 

$\pi(xy) = \pi(x)\pi(y)$ 

• For x in A 

$\pi(x^*) = \pi(x)^*$ 

In the case of C*-algebras, any *-homomorphism π between C*-algebras is non-expansive, i.e. bounded with norm ≤ 
1. Furthermore, an injective *-homomorphism between C*-algebras is isometric. These are consequences of the 
C*-identity. 

A bijective *-homomorphism π is called a C*-isomorphism, in which case A and B are said to be isomorphic. 

Some history: B*-algebras and C*-algebras 

The term B*-algebra was introduced by C. E. Rickart in 1946 to describe Banach *-algebras that satisfy the 
condition 

• $\|x x^*\| = \|x\|^2$ for all x in the given B*-algebra. (B*-condition) 

This condition automatically implies that the *-involution is isometric, that is, $\|x\| = \|x^*\|$. Hence $\|x x^*\| = \|x\| \|x^*\|$, 
and therefore a B*-algebra is a C*-algebra. Conversely, the C*-condition implies the B*-condition. This is 
nontrivial, and can be proved without using the condition $\|x\| = \|x^*\|$. (For details, see R. S. Doran, V. A. Belfi, 
Characterizations of C*-Algebras: the Gelfand-Naimark Theorems, CRC, 1986.) 

For these reasons, the term B*-algebra is rarely used in current terminology, and has been replaced by the term 
'C*-algebra'. 

The term C*-algebra was introduced by I. E. Segal in 1947 to describe norm-closed subalgebras of B(H), namely, 
the space of bounded operators on some Hilbert space H. 'C' stood for 'closed'. 

Examples 

Finite-dimensional C*-algebras 

The algebra M_n(C) of n-by-n matrices over C becomes a C*-algebra if we consider matrices as operators on the 
Euclidean space, C^n, and use the operator norm ||·|| on matrices. The involution is given by the conjugate transpose. 
More generally, one can consider finite direct sums of matrix algebras. In fact, all finite-dimensional C*-algebras are 
of this form. The self-adjoint requirement means finite-dimensional C*-algebras are semisimple, from which fact 
one can deduce the following theorem of Artin-Wedderburn type: 

Theorem. A finite-dimensional C*-algebra, A, is canonically isomorphic to a finite direct sum 

$A = \bigoplus_{e \in \min A} A e$ 

where min A is the set of minimal nonzero self-adjoint central projections of A. 

Each C*-algebra, Ae, is isomorphic (in a noncanonical way) to the full matrix algebra M_{dim(e)}(C). The finite family 
indexed on min A given by {dim(e)}_e is called the dimension vector of A. This vector uniquely determines the 
isomorphism class of a finite-dimensional C*-algebra. 
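
The structure theorem can be made concrete with a small block-diagonal model. The Python sketch below is an added illustration, not taken from the source: it assumes the hypothetical dimension vector (2, 1), realises A = M_2(C) ⊕ M_1(C) as block-diagonal 3-by-3 matrices, and verifies that the block identities are self-adjoint central projections summing to the unit. The function names embed and central_projection are invented for this example.

```python
import numpy as np

dim_vector = (2, 1)              # hypothetical example: A = M_2(C) ⊕ M_1(C)
n = sum(dim_vector)

def embed(blocks):
    """Place the given square blocks on the diagonal of an n-by-n matrix."""
    out = np.zeros((n, n), dtype=complex)
    pos = 0
    for b in blocks:
        d = b.shape[0]
        out[pos:pos + d, pos:pos + d] = b
        pos += d
    return out

def central_projection(k):
    """Identity on the k-th block and zero on the others."""
    return embed([np.eye(d) if i == k else np.zeros((d, d))
                  for i, d in enumerate(dim_vector)])

rng = np.random.default_rng(0)
# A generic element of A: one arbitrary matrix per block.
a = embed([rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
           for d in dim_vector])

projections = [central_projection(k) for k in range(len(dim_vector))]

# Complex dimension of A is the sum of the squares of the block sizes.
algebra_dim = sum(d * d for d in dim_vector)
print(algebra_dim)
```

Changing dim_vector to another tuple of positive integers gives a model of the corresponding isomorphism class.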



C*-algebras of operators 

The prototypical example of a C*-algebra is the algebra B(H) of bounded (equivalently continuous) linear operators 
defined on a complex Hilbert space H; here x* denotes the adjoint operator of the operator x : H → H. In fact, every 
C*-algebra, A, is *-isomorphic to a norm-closed adjoint-closed subalgebra of B(H) for a suitable Hilbert space, H; 
this is the content of the Gelfand-Naimark theorem. 

Commutative C*-algebras 

Let X be a locally compact Hausdorff space. The space C_0(X) of complex-valued continuous functions on X that 
vanish at infinity (defined in the article on local compactness) forms a commutative C*-algebra under 
pointwise multiplication and addition. The involution is pointwise conjugation. C_0(X) has a multiplicative unit 
element if and only if X is compact. As does any C*-algebra, C_0(X) has an approximate identity. In the case of C_0(X) 
this is immediate: consider the directed set of compact subsets of X, and for each compact K let f_K be a function of 
compact support which is identically 1 on K. Such functions exist by the Tietze extension theorem, which applies to 
locally compact Hausdorff spaces. {f_K}_K is an approximate identity. 

The Gelfand representation states that every commutative C*-algebra is *-isomorphic to the algebra C_0(X), where X 
is the space of characters equipped with the weak* topology. Furthermore if C_0(X) is isomorphic to C_0(Y) as 
C*-algebras, it follows that X and Y are homeomorphic. This characterization is one of the motivations for the 
noncommutative topology and noncommutative geometry programs. 

C*-algebras of compact operators 

Let H be a separable infinite-dimensional Hilbert space. The algebra K(H) of compact operators on H is a norm 
closed subalgebra of B(H). It is also closed under involution; hence it is a C*-algebra. 

Concrete C*-algebras of compact operators admit a characterization similar to Wedderburn's theorem for finite 
dimensional C*-algebras. 

Theorem. If A is a C*-subalgebra of K(H), then there exist Hilbert spaces {H_i}_{i ∈ I} such that A is isomorphic to the 
following direct sum 

$\bigoplus_{i \in I} K(H_i)$ 

where the (C*-)direct sum consists of elements (T_i) of the Cartesian product Π K(H_i) with ||T_i|| → 0. 

Though K(H) does not have an identity element, a sequential approximate identity for K(H) can be easily displayed. 
To be specific, H is isomorphic to the space of square summable sequences l^2; we may assume that 

$H = \ell^2$. 

For each natural number n let H_n be the subspace of sequences of l^2 which vanish for indices 

$k \geq n$ 

and let e_n be the orthogonal projection onto H_n. The sequence {e_n}_n is an approximate identity for K(H). 

K(H) is a two-sided closed ideal of B(H). For separable Hilbert spaces, it is the unique nontrivial closed two-sided 
ideal. The quotient of B(H) by K(H) is the Calkin algebra. 
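
The behaviour of the coordinate projections as an approximate identity for K(H) can be imitated in finite dimensions. In the Python sketch below, which is an added illustration under stated assumptions, the space l^2 is truncated to its first N = 200 coordinates, the compact operator is taken to be the diagonal operator with entries 1/k, and e(n) denotes the projection onto the first n coordinates (the indexing convention is an assumption of the sketch). The error ||e(n) T - T|| shrinks with n, whereas for the identity operator, which is not compact on an infinite-dimensional space, the analogous error stays equal to 1.

```python
import numpy as np

N = 200  # size of the finite truncation (an assumption of this sketch)

# Truncation of the compact diagonal operator with entries 1, 1/2, 1/3, ...
T = np.diag(1.0 / np.arange(1, N + 1))

def e(n):
    """Orthogonal projection onto the first n coordinates."""
    p = np.zeros((N, N))
    p[:n, :n] = np.eye(n)
    return p

def op_norm(a):
    """Operator norm, i.e. the largest singular value."""
    return np.linalg.norm(a, 2)

# ||e_n T - T|| equals 1/(n+1) for this diagonal operator.
errors_compact = [op_norm(e(n) @ T - T) for n in (1, 10, 100)]

# ||e_n - 1|| stays at 1 as long as n < N.
errors_identity = [op_norm(e(n) - np.eye(N)) for n in (1, 10, 100)]

print(errors_compact, errors_identity)
```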



C*-enveloping algebra 

Given a Banach *-algebra A with an approximate identity, there is a unique (up to C*-isomorphism) C*-algebra E(A) and 
*-morphism π from A into E(A) which is universal, that is, every other continuous *-morphism π' : A → B into a 
C*-algebra B factors uniquely through π. The algebra E(A) is called the C*-enveloping algebra of the Banach *-algebra A. 

Of particular importance is the C*-algebra of a locally compact group G. This is defined as the enveloping 
C*-algebra of the group algebra of G. The C*-algebra of G provides context for general harmonic analysis of G in 
the case G is non-abelian. In particular, the dual of a locally compact group is defined to be the primitive ideal space 
of the group C*-algebra. See spectrum of a C*-algebra. 

von Neumann algebras 

von Neumann algebras, known as W* algebras before the 1960s, are a special kind of C*-algebra. They are required 
to be closed in the weak operator topology, which is weaker than the norm topology. Their study is a specialized area 
of functional analysis. 

Properties of C*-algebras 

C*-algebras have a large number of properties that are technically convenient. These properties can be established by 
using the continuous functional calculus or by reduction to commutative C*-algebras. In the latter case, we can use the 
fact that the structure of these is completely determined by the Gelfand isomorphism. 

• The set of elements of a C*-algebra A of the form x*x forms a closed convex cone. This cone is identical to the 
elements of the form xx*. Elements of this cone are called non-negative (or sometimes positive, even though this 
terminology conflicts with its use for elements of R). 

• The set of self-adjoint elements of a C*-algebra A naturally has the structure of a partially ordered vector space; 
the ordering is usually denoted ≥. In this ordering, a self-adjoint element x of A satisfies x ≥ 0 if and only if the 
spectrum of x is non-negative. Two self-adjoint elements x and y of A satisfy x ≥ y if x - y ≥ 0. 

• Any C*-algebra A has an approximate identity. In fact, there is a directed family {e_λ}_{λ ∈ I} of self-adjoint elements 
of A such that 

$x e_\lambda \to x$ 

$0 \leq e_\lambda \leq e_\mu \leq 1$ whenever λ ≤ μ. 

In case A is separable, A has a sequential approximate identity. More generally, A will have a sequential 
approximate identity if and only if A contains a strictly positive element, i.e. a positive element h such that 
hAh is dense in A. 

• Using approximate identities, one can show that the algebraic quotient of a C* -algebra by a closed proper 
two-sided ideal, with the natural norm, is a C*-algebra. 

• Similarly, a closed two-sided ideal of a C*-algebra is itself a C*-algebra. 
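
The positivity statements in the list above can be tested in the C*-algebra M_3(C). The following Python sketch is an added illustration: the random matrices, the tolerance and the helper names are assumptions, and positivity is tested through the spectrum exactly as in the characterisation of the ordering given above.

```python
import numpy as np

rng = np.random.default_rng(1)

def random_element(n=3):
    """A random element of M_n(C)."""
    return rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))

def is_nonnegative(a, tol=1e-10):
    """True if the self-adjoint matrix a has spectrum in [0, inf)."""
    return bool(np.linalg.eigvalsh(a).min() >= -tol)

x, y = random_element(), random_element()
p = x.conj().T @ x   # element of the form x* x
q = y.conj().T @ y

checks = {
    "x*x >= 0": is_nonnegative(p),
    "xx* >= 0": is_nonnegative(x @ x.conj().T),
    "cone closed under sums": is_nonnegative(p + q),
    "p <= ||p|| 1": is_nonnegative(np.linalg.norm(p, 2) * np.eye(3) - p),
}
print(checks)
```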



Type for C*-algebras 

A C*-algebra A is of type I if and only if for all non-degenerate representations π of A the von Neumann algebra 
π(A)'' (that is, the bicommutant of π(A)) is a type I von Neumann algebra. In fact it is sufficient to consider only 
factor representations, i.e. representations π for which π(A)'' is a factor. 

A locally compact group is said to be of type I if and only if its group C*-algebra is type I. 

However, if a C*-algebra has non-type I representations, then by results of James Glimm it also has representations 
of type II and type III. Thus for C*-algebras and locally compact groups, it is only meaningful to speak of type I and 
non-type I properties. 

C*-algebras and quantum field theory 

In quantum field theory, one typically describes a physical system with a C*-algebra A with unit element; the 
self-adjoint elements of A (elements x with x* = x) are thought of as the observables, the measurable quantities, of 
the system. A state of the system is defined as a positive functional on A (a C-linear map φ : A → C with φ(u*u) ≥ 
0 for all u ∈ A) such that φ(1) = 1. The expected value of the observable x, if the system is in state φ, is then φ(x). 
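
For the matrix algebra A = M_2(C), a state can be written as φ(x) = trace(ρx) for a density matrix ρ. The Python sketch below is an added illustration of the definition; the particular ρ, the observable sigma_z and the function name phi are hypothetical choices, not taken from the source.

```python
import numpy as np

# Hypothetical density matrix: positive semidefinite with trace 1.
rho = np.array([[0.75, 0.0], [0.0, 0.25]])

def phi(x):
    """State on M_2(C) given by phi(x) = trace(rho x)."""
    return np.trace(rho @ x)

identity = np.eye(2)
sigma_z = np.array([[1.0, 0.0], [0.0, -1.0]])   # a self-adjoint observable

rng = np.random.default_rng(2)
u = rng.normal(size=(2, 2)) + 1j * rng.normal(size=(2, 2))

normalized = phi(identity)               # phi(1), should be 1
expectation = phi(sigma_z)               # expected value of sigma_z
positivity = phi(u.conj().T @ u).real    # phi(u* u), should be >= 0

print(normalized, expectation, positivity)
```

Here the expected value of sigma_z is 0.75 - 0.25 = 0.5, the weighted average of its eigenvalues +1 and -1.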

See Local quantum physics. 

References 

• W. Arveson, An Invitation to C*-Algebra, Springer-Verlag, 1976. ISBN 0-387-90176-0. An excellent 
introduction to the subject, accessible for those with a knowledge of basic functional analysis. 

• A. Connes, Non-commutative geometry (http://www.alainconnes.org/docs/book94bigpdf.pdf), ISBN 
0-12-185860-X. This book is widely regarded as a source of new research material, providing much supporting 
intuition, but it is difficult. 

• J. Dixmier, Les C*-algèbres et leurs représentations, Gauthier-Villars, 1969. ISBN 0-7204-0762-1. This is a 
somewhat dated reference, but is still considered as a high-quality technical exposition. It is available in English 
from North Holland press. 

• G. Emch, Algebraic Methods in Statistical Mechanics and Quantum Field Theory, Wiley-Interscience, 1972. 
ISBN 0-471-23900-3. Mathematically rigorous reference which provides extensive physics background. 

• A.I. Shtern (2001), "C* algebra" (http://eom.springer.de/c/c020020.htm), in Hazewinkel, Michiel, 
Encyclopaedia of Mathematics, Springer, ISBN 978-1556080104 

• S. Sakai, C*-algebras and W*-algebras, Springer (1971) ISBN 3-540-63633-1 



Kac-Moody algebra 

In mathematics, a Kac-Moody algebra (named for Victor Kac and Robert Moody, who independently discovered 
them) is a Lie algebra, usually infinite-dimensional, that can be defined by generators and relations through a 
generalized Cartan matrix. These algebras form a generalization of finite-dimensional semisimple Lie algebras, and 
many properties related to the structure of a Lie algebra such as its root system, irreducible representations, and 
connection to flag manifolds have natural analogues in the Kac-Moody setting. A class of Kac-Moody algebras 
called affine Lie algebras is of particular importance in mathematics and theoretical physics, especially conformal 
field theory and the theory of exactly solvable models. Kac discovered an elegant proof of certain combinatorial 
identities, Macdonald identities, which is based on the representation theory of affine Kac-Moody algebras. Garland 
and Lepowsky demonstrated that Rogers-Ramanujan identities can be derived in a similar fashion. 



Definition 

A Kac-Moody algebra is given by the following: 

1. An n×n generalized Cartan matrix C = (c_ij) of rank r. 

2. A vector space $\mathfrak{h}$ over the complex numbers of dimension 2n - r. 

3. A set of n linearly independent elements α_i of $\mathfrak{h}$ and a set of n linearly independent elements α_i* of the dual 
space, such that α_i*(α_j) = c_ij. The α_i are known as coroots, while the α_i* are known as roots. 

The Kac-Moody algebra is the Lie algebra $\mathfrak{g}$ defined by generators e_i and f_i and the elements of $\mathfrak{h}$ and relations 

• $[e_i, f_i] = \alpha_i$ 

• $[e_i, f_j] = 0$ for $i \neq j$ 

• $[e_i, x] = \alpha_i^*(x) e_i$, for $x \in \mathfrak{h}$ 

• $[f_i, x] = -\alpha_i^*(x) f_i$, for $x \in \mathfrak{h}$ 

• $[x, x'] = 0$ for $x, x' \in \mathfrak{h}$ 

• $\mathrm{ad}(e_i)^{1 - c_{ij}}(e_j) = 0$ for $i \neq j$ 

• $\mathrm{ad}(f_i)^{1 - c_{ij}}(f_j) = 0$ for $i \neq j$ 

where $\mathrm{ad} : \mathfrak{g} \to \mathrm{End}(\mathfrak{g})$, ad(x)(y) = [x, y], is the adjoint representation of $\mathfrak{g}$. 

A real (possibly infinite-dimensional) Lie algebra is also considered a Kac-Moody algebra if its complexification is 
a Kac-Moody algebra. 
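
For a finite-type example, the relations of the definition can be checked in the standard matrix realisation of sl_3, whose Cartan matrix is of type A_2. The Python sketch below is an added illustration under stated assumptions: the generators are taken to be the usual matrix units, the helper names are invented, and only relations that do not depend on a sign convention for the action of the Cartan subalgebra are tested (the relation [e_i, f_j] = 0 for i ≠ j and the two families of relations involving powers of ad).

```python
import numpy as np

def unit(i, j, n=3):
    """Matrix unit E_ij in M_n."""
    m = np.zeros((n, n))
    m[i, j] = 1.0
    return m

def bracket(a, b):
    """Commutator [a, b] = ab - ba."""
    return a @ b - b @ a

def ad_power(x, y, k):
    """Apply ad(x) to y a total of k times."""
    for _ in range(k):
        y = bracket(x, y)
    return y

C = np.array([[2, -1], [-1, 2]])              # Cartan matrix of type A_2
e = [unit(0, 1), unit(1, 2)]                  # assumed generators e_1, e_2
f = [unit(1, 0), unit(2, 1)]                  # assumed generators f_1, f_2
h = [bracket(e[i], f[i]) for i in range(2)]   # diag(1,-1,0) and diag(0,1,-1)

n = C.shape[0]
r = np.linalg.matrix_rank(C)
cartan_dim = 2 * n - r                        # dimension 2n - r of the Cartan part

# ad(e_i)^(1 - c_ij)(e_j) = 0 and ad(f_i)^(1 - c_ij)(f_j) = 0 for i != j.
serre_ok = all(
    np.allclose(ad_power(e[i], e[j], 1 - C[i, j]), 0)
    and np.allclose(ad_power(f[i], f[j], 1 - C[i, j]), 0)
    for i in range(2) for j in range(2) if i != j
)

# [e_i, f_j] = 0 for i != j.
mixed_ok = all(
    np.allclose(bracket(e[i], f[j]), 0)
    for i in range(2) for j in range(2) if i != j
)

print(cartan_dim, serre_ok, mixed_ok)
```

The exponent 1 - c_ij = 2 is sharp in this example: a single application of ad(e_1) to e_2 gives the nonzero matrix unit E_13.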



Interpretation 

$\mathfrak{h}$ is a Cartan subalgebra of the Kac-Moody algebra. 
If g is an element of the Kac-Moody algebra such that 

$\forall x \in \mathfrak{h}, \ [g, x] = \omega(x) g$ 

where ω is an element of $\mathfrak{h}^*$, then g is said to have weight ω. The Kac-Moody algebra can be diagonalized into 
weight eigenvectors. The Cartan subalgebra $\mathfrak{h}$ has weight zero, e_i has weight α_i* and f_i has weight -α_i*. If the Lie 
bracket of two weight eigenvectors is nonzero, then its weight is the sum of their weights. The condition 
$[e_i, f_j] = 0$ for $i \neq j$ simply means the α_i* are simple roots. 



Types of Kac-Moody algebras 

Properties of a Kac-Moody algebra are controlled by the algebraic properties of its generalized Cartan matrix C. In 
order to classify Kac-Moody algebras, it is enough to consider the case of an indecomposable matrix C, that is, 
assume that there is no decomposition of the set of indices I into a disjoint union of non-empty subsets I_1 and I_2 such 
that c_ij = 0 for all i in I_1 and j in I_2. Any decomposition of the generalized Cartan matrix leads to the direct sum 
decomposition of the corresponding Kac-Moody algebra: 

$\mathfrak{g}(C) \simeq \mathfrak{g}(C_1) \oplus \mathfrak{g}(C_2)$, 

where the two Kac-Moody algebras in the right hand side are associated with the submatrices of C corresponding to 
the index sets I_1 and I_2. 

An important subclass of Kac-Moody algebras corresponds to symmetrizable generalized Cartan matrices C, which 
can be decomposed as DS, where D is a diagonal matrix with positive integer entries and S is a symmetric matrix. 
Under the assumptions that C is symmetrizable and indecomposable, the Kac-Moody algebras are divided into three 
classes: 

• A positive definite matrix S gives rise to a finite-dimensional simple Lie algebra. 

• A positive semidefinite matrix S gives rise to an infinite-dimensional Kac-Moody algebra of affine type, or an 
affine Lie algebra. 

• An indefinite matrix S gives rise to a Kac-Moody algebra of indefinite type. 

Since the diagonal entries of C and S are positive, S cannot be negative definite or negative semidefinite. 

Symmetrizable indecomposable generalized Cartan matrices of finite and affine type have been completely 
classified. They correspond to Dynkin diagrams and affine Dynkin diagrams. Very little is known about the 
Kac-Moody algebras of indefinite type. Among those, the main focus has been on the (generalized) Kac-Moody 
algebras of hyperbolic type, for which the matrix S is indefinite, but for each proper subset of I, the corresponding 
submatrix is positive definite or positive semidefinite. Such matrices have rank at most 10 and have also been 
completely determined. 
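
The three-way division into finite, affine and indefinite type can be turned into a short computation. The Python sketch below is an added illustration, not the source's method: it assumes the input is an indecomposable symmetrizable generalized Cartan matrix given as a list of integer rows, finds a diagonal D with C = DS by propagating the symmetry constraint along nonzero entries, and classifies by the sign of the smallest eigenvalue of S. The entries of D come out as positive rationals rather than positive integers; rescaling D by a positive constant does not change the definiteness of S. The function names and the tolerance are assumptions.

```python
from fractions import Fraction
import numpy as np

def symmetrize(C):
    """Return (d, S) with C = diag(d) S and S symmetric.

    C is assumed to be an indecomposable generalized Cartan matrix,
    given as a list of lists of Python integers.
    """
    n = len(C)
    d = [None] * n
    d[0] = Fraction(1)
    stack = [0]
    while stack:
        i = stack.pop()
        for j in range(n):
            if j != i and C[i][j] != 0 and d[j] is None:
                # Symmetry of S = D^{-1} C forces d_j / d_i = c_ji / c_ij.
                d[j] = d[i] * Fraction(C[j][i], C[i][j])
                stack.append(j)
    if any(v is None for v in d):
        raise ValueError("matrix is decomposable")
    S = [[Fraction(C[i][j]) / d[i] for j in range(n)] for i in range(n)]
    if any(S[i][j] != S[j][i] for i in range(n) for j in range(n)):
        raise ValueError("matrix is not symmetrizable")
    return d, S

def classify(C, tol=1e-9):
    """Classify C as 'finite', 'affine' or 'indefinite'."""
    _, S = symmetrize(C)
    eig = np.linalg.eigvalsh(np.array([[float(v) for v in row] for row in S]))
    if eig.min() > tol:
        return "finite"       # S positive definite
    if eig.min() > -tol:
        return "affine"       # S positive semidefinite, not definite
    return "indefinite"

examples = {
    "A_2": [[2, -1], [-1, 2]],
    "B_2": [[2, -2], [-1, 2]],
    "affine A_1": [[2, -2], [-2, 2]],
    "rank 2, off-diagonal -3": [[2, -3], [-3, 2]],
}
for name, C in examples.items():
    print(name, classify(C))
```

The eigenvalue test uses floating point, so matrices whose smallest eigenvalue is within the tolerance of zero are reported as affine; exact arithmetic on S would be needed for a rigorous decision.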

References 

• A. J. Wassermann, Lecture notes on Kac-Moody and Virasoro algebras [1] 

• V. Kac, Infinite dimensional Lie algebras ISBN 0521466938 

• Hazewinkel, Michiel, ed. (2001), "Kac-Moody algebra" [2], Encyclopaedia of Mathematics, Springer, 
ISBN 978-1556080104 

• V.G. Kac, Simple irreducible graded Lie algebras of finite growth Math. USSR Izv., 2 (1968) pp. 1271-1311, 
Izv. Akad. Nauk USSR Ser. Mat., 32 (1968) pp. 1923-1967 

• R.V. Moody, A new class of Lie algebras J. of Algebra, 10 (1968) pp. 211-230 

External links 

• http://www.emis.de/journals/SIGMA/Kac-Moody_algebras.html 

References 

[1] http://arxiv.org/abs/1004.1287 

[2] http://eom.springer.de/K/k055050.htm 



Spectral theory 

In mathematics, spectral theory is an inclusive term for theories extending the eigenvector and eigenvalue theory of 
a single square matrix to a much broader theory of the structure of operators in a variety of mathematical spaces. It 
is a result of studies of linear algebra and the solutions of systems of linear equations and their generalizations. The 
theory is connected to that of analytic functions because the spectral properties of an operator are related to analytic 
functions of the spectral parameter. 



Mathematical background 

The name spectral theory was introduced by David Hilbert in his original formulation of Hilbert space theory, which 
was cast in terms of quadratic forms in infinitely many variables. The original spectral theorem was therefore 
conceived as a version of the theorem on principal axes of an ellipsoid, in an infinite-dimensional setting. The later 
discovery in quantum mechanics that spectral theory could explain features of atomic spectra was therefore 
fortuitous. 

There have been three main ways to formulate spectral theory, all of which retain their usefulness. After Hilbert's 
initial formulation, the later development of abstract Hilbert space and the spectral theory of a single normal operator 
on it did very much go in parallel with the requirements of physics, particularly at the hands of von Neumann. The 
further theory built on this to include Banach algebras, which can be given abstractly. This development leads to the 
Gelfand representation, which covers the commutative case, and further into non-commutative harmonic analysis. 

The difference can be seen in making the connection with Fourier analysis. The Fourier transform on the real line is 
in one sense the spectral theory of differentiation qua differential operator. But for that to cover the phenomena one 
has already to deal with generalized eigenfunctions (for example, by means of a rigged Hilbert space). On the other 
hand it is simple to construct a group algebra, the spectrum of which captures the Fourier transform's basic 
properties, and this is carried out by means of Pontryagin duality. 

One can also study the spectral properties of operators on Banach spaces. For example, compact operators on Banach 
spaces have many spectral properties similar to those of matrices. 



Physical background 

The background in the physics of vibrations has been explained in this way: 

Spectral theory is connected with the investigation of localized vibrations of a variety of different objects, from atoms and molecules in 
chemistry to obstacles in acoustic waveguides. These vibrations have frequencies, and the issue is to decide when such localized vibrations 
occur, and how to go about computing the frequencies. This is a very complicated problem since every object has not only a fundamental tone 
but also a complicated series of overtones, which vary radically from one body to another. 

The mathematical theory is not dependent on such physical ideas on a technical level, but there are examples of 
mutual influence (see for example Mark Kac's question Can you hear the shape of a drum?). Hilbert's adoption of 
the term "spectrum" has been attributed to an 1897 paper of Wilhelm Wirtinger on Hill's equation (by Jean 
Dieudonné), and it was taken up by his students during the first decade of the twentieth century, among them Erhard 
Schmidt and Hermann Weyl. The conceptual basis for Hilbert space was developed from Hilbert's ideas by Erhard 
Schmidt and Frigyes Riesz.[6][7] It was almost twenty years later, when quantum mechanics was formulated in terms 
of the Schrödinger equation, that the connection was made to atomic spectra; a connection with the mathematical 
physics of vibration had been suspected before, as remarked by Henri Poincaré, but rejected for simple quantitative 
reasons, absent an explanation of the Balmer series. The later discovery in quantum mechanics that spectral theory 
could explain features of atomic spectra was therefore fortuitous, rather than being an object of Hilbert's spectral 
theory. 



A definition of spectrum 

Consider a bounded linear transformation T defined everywhere over a general Banach space. We form the 
transformation: 

$R_\zeta = (\zeta I - T)^{-1}$. 

Here I is the identity operator and ζ is a complex number. The inverse of an operator T, that is T^{-1}, is defined by: 

$T T^{-1} = T^{-1} T = I$. 

If the inverse exists, T is called regular. If it does not exist, T is called singular. 

With these definitions, the resolvent set of T is the set of all complex numbers ζ such that R_ζ exists and is bounded. 
This set often is denoted as ρ(T). The spectrum of T is the set of all complex numbers ζ such that R_ζ fails to exist or 
is unbounded. Often the spectrum of T is denoted by σ(T). The function R_ζ for all ζ in ρ(T) (that is, wherever R_ζ 
exists) is called the resolvent of T. The spectrum of T is therefore the complement of the resolvent set of T in the 
complex plane.[9] Every eigenvalue of T belongs to σ(T), but σ(T) may contain non-eigenvalues.[10] 
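
In finite dimensions every bounded operator is a matrix, the spectrum consists of the eigenvalues only, and the resolvent can be computed directly. The Python sketch below is an added illustration; the 2-by-2 matrix T, the sample points and the function name resolvent are arbitrary assumptions. It checks the defining relation of R_ζ at a point of the resolvent set and shows that the norm of R_ζ grows without bound as ζ approaches an eigenvalue.

```python
import numpy as np

T = np.array([[2.0, 1.0], [0.0, 3.0]])   # example operator with eigenvalues 2 and 3
I = np.eye(2)

def resolvent(zeta):
    """R_zeta = (zeta I - T)^(-1), defined for zeta outside the spectrum."""
    return np.linalg.inv(zeta * I - T)

spectrum = np.sort(np.linalg.eigvals(T).real)

# At a point of the resolvent set, (zeta I - T) R_zeta = I.
zeta = 1.0 + 1.0j
residual = np.linalg.norm((zeta * I - T) @ resolvent(zeta) - I)

# The operator norm of R_zeta is at least 1 / dist(zeta, spectrum),
# so it grows as zeta approaches the eigenvalue 2.
norms = [np.linalg.norm(resolvent(2.0 + eps), 2) for eps in (1e-1, 1e-2, 1e-3)]

print(spectrum, residual, norms)
```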

This definition applies to a Banach space, but of course other types of space exist as well; for example, topological 
vector spaces include Banach spaces, but can be more general.[11][12] On the other hand, Banach spaces include 
Hilbert spaces,[13] and it is these spaces that find the greatest application and the richest theoretical results. With 
suitable restrictions, much can be said about the structure of the spectra of transformations in a Hilbert space. In 
particular, for self-adjoint operators, the spectrum lies on the real line and (in general) is a spectral combination of a 
point spectrum of discrete eigenvalues and a continuous spectrum.[14] 

What is spectral theory, roughly speaking? 

In functional analysis and linear algebra the spectral theorem establishes conditions under which an operator can be 
expressed in simple form as a sum of simpler operators. As a full rigorous presentation is not appropriate for this 
article, we take an approach that avoids much of the rigor and satisfaction of a formal treatment with the aim of 
being more comprehensible to a non-specialist. 

This topic is easiest to describe by introducing the bra-ket notation of Dirac for operators.[15][16] As an example, a
very particular linear operator L might be written as a dyadic product:[17][18]

$$L = |k_1\rangle\langle b_1| ,$$

in terms of the "bra" $\langle b_1|$ and the "ket" $|k_1\rangle$. A function f is described by a ket as $|f\rangle$. The function f(x) defined
on the coordinates $(x_1, x_2, x_3, \dots)$ is denoted as:

$$f(x) = \langle x, f\rangle ,$$

and the magnitude of f by:

$$\|f\|^2 = \langle f, f\rangle = \int dx\, \langle f, x\rangle\langle x, f\rangle = \int dx\, f^*(x) f(x) ,$$

where the notation '*' denotes a complex conjugate. This inner product choice defines a very specific inner product
space, restricting the generality of the arguments that follow.[13]
The effect of L upon a function f is then described as:

$$L|f\rangle = |k_1\rangle\langle b_1|f\rangle ,$$

expressing the result that the effect of L on f is to produce a new function $|k_1\rangle$ multiplied by the inner product
represented by $\langle b_1|f\rangle$.

A more general linear operator L might be expressed as:

$$L = \lambda_1|e_1\rangle\langle f_1| + \lambda_2|e_2\rangle\langle f_2| + \lambda_3|e_3\rangle\langle f_3| + \dots ,$$

where the $\{\lambda_i\}$ are scalars and the $\{|e_i\rangle\}$ are a basis and the $\{\langle f_i|\}$ a reciprocal basis for the space. The
relation between the basis and the reciprocal basis is described, in part, by:






$$\langle f_i|e_j\rangle = \delta_{ij} .$$

If such a formalism applies, the $\{\lambda_i\}$ are eigenvalues of L and the functions $\{|e_i\rangle\}$ are eigenfunctions of L.
The eigenvalues are in the spectrum of L.[19]
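
The dyadic expansion can be checked numerically in a finite-dimensional setting. The following is a minimal sketch, assuming a hypothetical 3-dimensional space with a deliberately non-orthogonal basis; the matrices `E`, `F` and the values in `lam` are illustrative choices, not taken from the text. The reciprocal basis is obtained by matrix inversion, which enforces the biorthogonality relation.

```python
import numpy as np

# Hypothetical example: the columns of E are the basis kets |e_i>,
# chosen non-orthogonal on purpose.
E = np.array([[1.0, 1.0, 0.0],
              [0.0, 1.0, 1.0],
              [0.0, 0.0, 1.0]])
lam = np.array([2.0, -1.0, 0.5])   # the scalars lambda_i (illustrative values)

# The rows of F are the reciprocal bras <f_i|, fixed by <f_i|e_j> = delta_ij.
F = np.linalg.inv(E)

# L = sum_i lambda_i |e_i><f_i|
L = sum(lam[i] * np.outer(E[:, i], F[i, :]) for i in range(3))

# Check biorthogonality and the eigenvalue relation L|e_i> = lambda_i |e_i>.
biorthogonal = np.allclose(F @ E, np.eye(3))
eigen_ok = all(np.allclose(L @ E[:, i], lam[i] * E[:, i]) for i in range(3))
print(biorthogonal, eigen_ok)
```

In this finite-dimensional sketch the construction always works once the basis is linearly independent; the questions raised in the text concern what survives in infinite dimensions.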

Some natural questions are: under what circumstances does this formalism work, and for what operators L are 
expansions in series of other operators like this possible? Can any function f be expressed in terms of the
eigenfunctions (are they a complete set) and under what circumstances does a point spectrum or a continuous 
spectrum arise? How do the formalisms for infinite dimensional spaces and finite dimensional spaces differ, or do 
they differ? Can these ideas be extended to a broader class of spaces? Answering such questions is the realm of 
spectral theory and requires considerable background in functional analysis and matrix algebra. 

Resolution of the identity 

This section continues in the rough and ready manner of the above section using the bra-ket notation, and glossing
over the many important and fascinating details of a rigorous treatment.[20] A rigorous mathematical treatment may
be found in various references.[21]

Using the bra-ket notation of the above section, the identity operator may be written as:

$$I = \sum_{i=1}^{n} |e_i\rangle\langle f_i| ,$$

where it is supposed as above that $\{|e_i\rangle\}$ are a basis and the $\{\langle f_i|\}$ a reciprocal basis for the space satisfying
the relation:

$$\langle f_i|e_j\rangle = \delta_{ij} .$$

This expression of the identity operation is called a representation or a resolution of the identity. This formal
representation satisfies the basic property of the identity:

$$I^n = I ,$$

valid for every positive integer n.

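Applying the resolution of the identity to any function in the space $|\psi\rangle$, one obtains:

$$I|\psi\rangle = |\psi\rangle = \sum_{i=1}^{n} |e_i\rangle\langle f_i|\psi\rangle = \sum_{i=1}^{n} c_i |e_i\rangle ,$$

which, with coefficients $c_i = \langle f_i|\psi\rangle$, is the generalized Fourier expansion of $\psi$ in terms of the basis
functions $\{e_i\}$.[22]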
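Given some operator equation of the form:

$$O|\psi\rangle = |h\rangle$$

with h in the space, this equation can be solved in the above basis through the formal manipulations:

$$O|\psi\rangle = \sum_{i=1}^{n} c_i \left(O|e_i\rangle\right) = \sum_{i=1}^{n} |e_i\rangle\langle f_i|h\rangle ,$$

$$\langle f_j|O|\psi\rangle = \sum_{i=1}^{n} c_i \langle f_j|O|e_i\rangle = \sum_{i=1}^{n} \langle f_j|e_i\rangle\langle f_i|h\rangle = \langle f_j|h\rangle , \quad \forall j ,$$

which converts the operator equation to a matrix equation determining the unknown coefficients $c_i$ in terms of the
generalized Fourier coefficients $\langle f_j|h\rangle$ of h and the matrix elements $O_{ji} = \langle f_j|O|e_i\rangle$ of the operator O.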
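
The conversion from an operator equation to a matrix equation can be illustrated in finite dimensions. The sketch below assumes a hypothetical 3-dimensional space; the basis `E`, the operator `O` and the right-hand side `h` are made-up values chosen only so that every matrix involved is invertible.

```python
import numpy as np

# Hypothetical basis (columns of E) and its reciprocal basis (rows of F).
E = np.array([[1.0, 1.0, 0.0],
              [0.0, 1.0, 1.0],
              [0.0, 0.0, 1.0]])
F = np.linalg.inv(E)

# Hypothetical operator O and right-hand side h.
O = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 4.0]])
h = np.array([1.0, 2.0, 3.0])

O_matrix = F @ O @ E      # matrix elements O_ji = <f_j|O|e_i>
h_coeffs = F @ h          # generalized Fourier coefficients <f_j|h>

# Solve sum_i O_ji c_i = <f_j|h> for the unknown coefficients c_i.
c = np.linalg.solve(O_matrix, h_coeffs)

# Reassemble |psi> = sum_i c_i |e_i> and check that O|psi> = |h>.
psi = E @ c
residual = np.linalg.norm(O @ psi - h)
print(residual)
```

Any linearly independent basis would do here; the point of spectral theory, as the following paragraph explains, is to pick the basis in which the matrix becomes diagonal.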

The role of spectral theory arises in establishing the nature and existence of the basis and the reciprocal basis. In 
particular, the basis might consist of the eigenfunctions of some linear operator L: 

$$L|e_i\rangle = \lambda_i |e_i\rangle ;$$

with the $\{\lambda_i\}$ the eigenvalues of L from the spectrum of L. Then the resolution of the identity above provides the
dyad expansion of L: 






$$LI = L = \sum_{i=1}^{n} L|e_i\rangle\langle f_i| = \sum_{i=1}^{n} \lambda_i |e_i\rangle\langle f_i| .$$

Resolvent operator 

Using spectral theory, the resolvent operator R:

$$R = (\lambda - L)^{-1} ,$$

can be evaluated in terms of the eigenfunctions and eigenvalues of L, and the Green's function corresponding to L
can be found.

Applying R to some arbitrary function in the space, say $\varphi$,

$$R|\varphi\rangle = (\lambda - L)^{-1}|\varphi\rangle = \sum_{i=1}^{n} \frac{1}{\lambda - \lambda_i} |e_i\rangle\langle f_i, \varphi\rangle .$$

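This function has poles in the complex $\lambda$-plane at each eigenvalue of L. Thus, using the calculus of residues:

$$\frac{1}{2\pi i}\oint_C d\lambda\, (\lambda - L)^{-1}|\varphi\rangle = -\sum_{i=1}^{n} |e_i\rangle\langle f_i, \varphi\rangle = -|\varphi\rangle ,$$

where the line integral is over a contour C that includes all the eigenvalues of L. The minus sign corresponds to
traversing C clockwise; with the usual counterclockwise orientation each pole contributes $+|e_i\rangle\langle f_i, \varphi\rangle$ and the
result is $+|\varphi\rangle$.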
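
The residue calculation can be checked numerically for a small matrix. The following sketch assumes a hypothetical 2-by-2 operator with eigenvalues 1 and -0.5 and integrates the resolvent around a circle of radius 3, traversed counterclockwise. With that orientation the result is the identity with a plus sign, so the minus sign in the formula above matches a clockwise contour.

```python
import numpy as np

# Hypothetical operator built from a non-orthogonal eigenbasis.
E = np.array([[1.0, 1.0],
              [0.0, 1.0]])
lam = np.array([1.0, -0.5])
L = E @ np.diag(lam) @ np.linalg.inv(E)

# Discretize a counterclockwise circle of radius 3 around the origin;
# it encloses both eigenvalues.
m = 400
radius = 3.0
identity = np.eye(2)
total = np.zeros((2, 2), dtype=complex)
for k in range(m):
    z = radius * np.exp(2j * np.pi * k / m)
    dz = 1j * z * (2 * np.pi / m)                 # d(lambda) along the circle
    total += np.linalg.inv(z * identity - L) * dz

contour = total / (2j * np.pi)   # (1 / 2 pi i) * contour integral of (lambda - L)^{-1}
print(np.round(contour.real, 6))
```

Reversing the direction of travel flips the sign of `dz` and therefore of the result.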
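Suppose our functions are defined over some coordinates $\{x_j\}$, that is:

$$\langle x, \varphi\rangle = \varphi(x_1, x_2, \dots) ,$$

where the bra-kets corresponding to $\{x_j\}$ satisfy:

$$\langle x, y\rangle = \delta(x - y) ,$$

and where $\delta(x - y) = \delta(x_1 - y_1, x_2 - y_2, x_3 - y_3, \dots)$ is the Dirac delta function.[24]
Then:

$$\left\langle x, \frac{1}{2\pi i}\oint_C d\lambda\, (\lambda - L)^{-1}\varphi\right\rangle = \frac{1}{2\pi i}\oint_C d\lambda\, \langle x, (\lambda - L)^{-1}\varphi\rangle$$

$$= \frac{1}{2\pi i}\oint_C d\lambda \int dy\, \langle x, (\lambda - L)^{-1} y\rangle\langle y, \varphi\rangle .$$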

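The function $G(x, y; \lambda)$ defined by:

$$G(x, y; \lambda) = \langle x, (\lambda - L)^{-1} y\rangle$$

$$= \sum_{i=1}^{n}\sum_{j=1}^{n} \langle x, e_i\rangle\langle f_i, (\lambda - L)^{-1} e_j\rangle\langle f_j, y\rangle$$

$$= \sum_{i=1}^{n} \frac{\langle x, e_i\rangle\langle f_i, y\rangle}{\lambda - \lambda_i}$$

$$= \sum_{i=1}^{n} \frac{e_i(x) f_i^*(y)}{\lambda - \lambda_i} ,$$

is called the Green's function for operator L,[25] and satisfies:

$$\frac{1}{2\pi i}\oint_C d\lambda\, G(x, y; \lambda) = -\sum_{i=1}^{n} \langle x, e_i\rangle\langle f_i, y\rangle = -\langle x, y\rangle = -\delta(x - y) .$$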






Operator equations 

Consider the operator equation: 

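$$(O - \lambda I)|\psi\rangle = |h\rangle ;$$

in terms of coordinates:

$$\int dy\, \langle x, (O - \lambda I) y\rangle\langle y, \psi\rangle = h(x) .$$

A particular case is $\lambda = 0$.

The Green's function of the previous section (written here with $(O - \lambda I)^{-1}$, which differs in sign from the
resolvent $(\lambda - L)^{-1}$ used there) is:

$$\langle y, G(\lambda) z\rangle = \langle y, (O - \lambda I)^{-1} z\rangle = G(y, z; \lambda) ,$$

and satisfies:

$$\int dy\, \langle x, (O - \lambda I) y\rangle\langle y, G(\lambda) z\rangle = \int dy\, \langle x, (O - \lambda I) y\rangle\langle y, (O - \lambda I)^{-1} z\rangle = \langle x, z\rangle = \delta(x - z) .$$

Using this Green's function property:

$$\int dy\, \langle x, (O - \lambda I) y\rangle\, G(y, z; \lambda) = \delta(x - z) .$$

Then, multiplying both sides of this equation by h(z) and integrating:

$$\int dz\, h(z) \int dy\, \langle x, (O - \lambda I) y\rangle\, G(y, z; \lambda) = \int dy\, \langle x, (O - \lambda I) y\rangle \int dz\, h(z)\, G(y, z; \lambda) = h(x) ,$$

which suggests the solution is:

$$\psi(x) = \int dz\, h(z)\, G(x, z; \lambda) .$$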

That is, the function $\psi(x)$ satisfying the operator equation is found if we can find the spectrum of O, and construct G,
for example by using the eigenfunction expansion of the Green's function given in the previous section.

There are many other ways to find G, of course.[26] See the articles on Green's functions and on Fredholm integral
equations. It must be kept in mind that the above mathematics is purely formal, and a rigorous treatment involves
some pretty sophisticated mathematics, including a good background knowledge of functional analysis, Hilbert
spaces, distributions and so forth. Consult these articles and the references for more detail.
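
A finite-dimensional analogue may make the Green's function recipe concrete. In the sketch below the integral over z becomes a matrix-vector product, and G is assembled from the eigenvalues and eigenvectors of a hypothetical symmetric matrix `O`. The values of `O`, `lam` and `h` are illustrative assumptions. Because G is defined here as the inverse of O - lam I, the denominators are lambda_i - lam.

```python
import numpy as np

# Hypothetical operator, parameter and right-hand side.
O = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 4.0]])
lam = 0.5                         # chosen so that it is not an eigenvalue of O
h = np.array([1.0, 2.0, 3.0])

evals, E = np.linalg.eig(O)       # columns of E: eigenvectors |e_i>
F = np.linalg.inv(E)              # rows of F: reciprocal basis <f_i|

# G = (O - lam I)^{-1} = sum_i |e_i><f_i| / (lambda_i - lam)
G = sum(np.outer(E[:, i], F[i, :]) / (evals[i] - lam) for i in range(3))

# Discrete version of psi(x) = integral dz h(z) G(x, z; lam)
psi = G @ h
residual = np.linalg.norm((O - lam * np.eye(3)) @ psi - h)
print(residual)
```

The expansion breaks down when `lam` coincides with an eigenvalue, which is the finite-dimensional counterpart of the poles discussed in the resolvent section.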

Notes 

[1] Jean Alexandre Dieudonné (1981). History of functional analysis (http://books.google.com/books?id=mg7r4acKgqOC&printsec=frontcover). Elsevier. ISBN 0444861483.
[2] William Arveson (2002). "Chapter 1: spectral theory and Banach algebras" (http://books.google.com/books?id=ARdehHGWVlQC). A short course on spectral theory. Springer. ISBN 0387953000.
[3] Viktor Antonovich Sadovnichii (1991). "Chapter 4: The geometry of Hilbert space: the spectral theory of operators" (http://books.google.com/books?id=SRlQkG60kVEC&pg=PA181&lpg=PA181). Theory of Operators. Springer. ISBN 0306110288.
[4] John von Neumann (1996). The mathematical foundations of quantum mechanics; Volume 2 in Princeton Landmarks in Mathematics series (http://books.google.com/books?id=JLyCo3R04qUC&printsec=frontcover) (Reprint of translation of original 1932 ed.). Princeton University Press. ISBN 0691028931.
[5] E. Brian Davies, quoted on the King's College London analysis group website "Research at the analysis group" (http://www.kcl.ac.uk/schools/pse/maths/research/analysis/research.html).
[6] Nicholas Young (1988). An introduction to Hilbert space (http://books.google.com/books?id=_igwFHKwcyYC&pg=PA3). Cambridge University Press. p. 3. ISBN 0521337178.
[7] Jean-Luc Dorier (2000). On the teaching of linear algebra; Vol. 23 of Mathematics education library (http://books.google.com/books?id=gqZUGMKtNuoC&pg=PA50). Springer. ISBN 0792365399.






[8] Cf. Spectra in mathematics and in physics (http://www.dm.unito.it/personalpages/capietto/Spectra.pdf) by Jean Mawhin, p. 4 and pp. 10-11.
[9] Edgar Raymond Lorch (2003). Spectral Theory (http://books.google.com/books?id=X3U2AAAACAAJ) (Reprint of Oxford 1962 ed.). Textbook Publishers. p. 89. ISBN 0758171560.
[10] Nicholas Young. op. cit. (http://books.google.com/books?id=_igwFHKwcyYC&pg=PA81). p. 81. ISBN 0521337178.
[11] Helmut H. Schaefer, Manfred P. H. Wolff (1999). Topological vector spaces (http://books.google.com/books?id=9kXY742pABoC&pg=PA36) (2nd ed.). Springer. p. 36. ISBN 0387987266.
[12] Dmitrii Petrovich Zhelobenko (2006). Principal structures and methods of representation theory (http://books.google.com/books?id=3TkmvZktjp8C&pg=PA24). American Mathematical Society. ISBN 0821837311.
[13] Edgar Raymond Lorch (2003). "Chapter III: Hilbert Space" (http://books.google.com/books?id=X3U2AAAACAAJ). op. cit. p. 57. ISBN 0758171560.
[14] Edgar Raymond Lorch (2003). "Chapter V: The Structure of Self-Adjoint Transformations" (http://books.google.com/books?id=X3U2AAAACAAJ). op. cit. p. 106 ff. ISBN 0758171560.
[15] Bernard Friedman (1990). Principles and Techniques of Applied Mathematics (http://books.google.com/books?id=gnQeAQAAIAAJ) (Reprint of 1956 Wiley ed.). Dover Publications. p. 26. ISBN 0486664449.
[16] PAM Dirac (1981). The principles of quantum mechanics (http://books.google.com/books?id=XehUpGiM6FIC&pg=PA29) (4th ed.). Oxford University Press. p. 29 ff. ISBN 0198520115.
[17] Jürgen Audretsch (2007). "Chapter 1.1.2: Linear operators on the Hilbert space" (http://books.google.com/books?id=8NxIgwAOU6IC&pg=PA5). Entangled systems: new directions in quantum physics. Wiley-VCH. p. 5. ISBN 3527406840.
[18] R. A. Howland (2006). Intermediate dynamics: a linear algebraic approach (http://books.google.com/books?id=SepP8-W3M0AC&pg=PA69) (2nd ed.). Birkhäuser. p. 69 ff. ISBN 0387280596.
[19] Bernard Friedman (1990). "Chapter 2: Spectral theory of operators" (http://books.google.com/books?id=gnQeAQAAIAAJ). op. cit. p. 57. ISBN 0486664449.
[20] See discussion in Dirac's book referred to above, and Milan Vujicic (2008). Linear algebra thoroughly explained (http://books.google.com/books?id=pifStNLaXGkC&pg=PA274). Springer. p. 274. ISBN 3540746374.
[21] See, for example, the fundamental text of John von Neumann, op. cit. (http://books.google.com/books?id=JLyCo3R04qUC&printsec=frontcover). ISBN 0691028931; Arch W. Naylor, George R. Sell (2000). Linear Operator Theory in Engineering and Science; Vol. 40 of Applied mathematical science (http://books.google.com/books?id=t3SXs4-KrE0C&pg=PA401). Springer. p. 401. ISBN 038795001X; Steven Roman (2008). Advanced linear algebra (http://books.google.com/books?id=bSyQr-wUys8C&pg=PA233) (3rd ed.). Springer. ISBN 0387728287; Iurii Makarovich Berezanskii (1968). Expansions in eigenfunctions of self adjoint operators; Vol. 17 in Translations of mathematical monographs (http://books.google.com/books?id=OPPWBE3WQqkC&pg=PA317). American Mathematical Society. ISBN 0821815679.
[22] See, for example, Gerald B Folland (2009). "Convergence and completeness" (http://books.google.com/books?id=idAomhpwI8MC&pg=PA77). Fourier Analysis and its Applications (Reprint of Wadsworth & Brooks/Cole 1992 ed.). American Mathematical Society. pp. 77 ff. ISBN 0821847902.
[23] PAM Dirac. op. cit. (http://books.google.com/books?id=XehUpGiM6FIC&pg=PA65#v=onepage&q=&f=false). p. 65 ff. ISBN 0198520115.
[24] PAM Dirac. op. cit. (http://books.google.com/books?id=XehUpGiM6FIC&pg=PA60#v=onepage&q=&f=false). p. 60 ff. ISBN 0198520115.
[25] Bernard Friedman. op. cit. (http://books.google.com/books?id=gnQeAQAAIAAJ). p. 214, Eq. 2.14. ISBN 0486664449.
[26] For example, see Sadri Hassani (1999). "Chapter 20: Green's functions in one dimension" (http://books.google.com/books?id=BCMLOp6DyFIC&pg=RA1-PA553). Mathematical physics: a modern introduction to its foundations. Springer. p. 553 et seq. ISBN 0387985794; and Qing-Hua Qin (2007). Green's function and boundary elements of multifield materials (http://books.google.com/books?id=UUfy8CcJiDkC&printsec=frontcover). Elsevier. ISBN 0080451349.






General references 

• Hazewinkel, Michiel, ed. (2001), "Spectral theory of linear operators" (http://eom.springer.de/S/s086520.htm), Encyclopaedia of Mathematics, Springer, ISBN 978-1556080104

• Nelson Dunford; Jacob T Schwartz (1988). Linear Operators, Spectral Theory, Self Adjoint Operators in Hilbert Space (Part 2) (http://books.google.com/books?id=eOFfQQAACAAJ&dq=isbn:0471608475) (Paperback reprint of 1967 ed.). Wiley. ISBN 0471608475.

• Nelson Dunford; Jacob T Schwartz (1988). Linear Operators, Spectral Operators (Part 3) (http://books.google.com/books?id=B0SeJNIh3BwC&printsec=frontcover) (Paperback reprint of 1971 ed.). Wiley. ISBN 0471608467.

• Sadri Hassani (1999). "Chapter 4: Spectral decomposition" (http://books.google.com/books?id=BCMLOp6DyFIC&pg=RA1-PA109). Mathematical physics: a modern introduction to its foundations. Springer. ISBN 0387985794.

• Edward Brian Davies (1996). Spectral Theory and Differential Operators; Volume 42 in the Cambridge Studies in Advanced Mathematics (http://books.google.com/books?id=EXtKuJAksSUC&printsec=frontcover). Cambridge University Press. ISBN 0521587107.

• Arch W. Naylor, George R. Sell (2000). "Chapter 5, Part B: The Spectrum" (http://books.google.com/books?id=t3SXs4-KrE0C&pg=PA411). Linear Operator Theory in Engineering and Science; Volume 40 of Applied mathematical sciences. Springer. p. 411. ISBN 038795001X.

• Shmuel Kantorovitz (1983). Spectral Theory of Banach Space Operators. Springer.

External links 

• Evans M. Harrell II (http://www.mathphysics.com/opthy/OpHistory.html): A Short History of Operator Theory

• Gregory H. Moore (1995). "The axiomatization of linear algebra: 1875-1940". Historia Mathematica 22: 262-303. doi:10.1006/hmat.1995.1025.






Quantum Field Theory, SUSY, Quantum 
Geometry and Quantum Algebraic Topology 

Quantum electrodynamics 

Quantum electrodynamics (QED) is the relativistic quantum field theory of electrodynamics. In essence, it
describes how light and matter interact and is the first theory where full agreement between quantum mechanics and
special relativity is achieved. QED mathematically describes all phenomena involving electrically charged particles
interacting by means of exchange of photons and represents the quantum counterpart of classical electrodynamics,
giving a complete account of matter and light interaction. One of the founding fathers of QED, Richard Feynman,
has called it "the jewel of physics" for its extremely accurate predictions of quantities like the anomalous magnetic
moment of the electron, and the Lamb shift of the energy levels of hydrogen.

In technical terms, QED can be described as a perturbation theory of the electromagnetic quantum vacuum. 



History 

The first formulation of a quantum theory describing radiation and matter interaction is due to Paul Adrien Maurice
Dirac, who, during the 1920s, was first able to compute the coefficient of spontaneous emission of an atom.

Dirac described the quantization of the electromagnetic field as an ensemble of harmonic oscillators with the
introduction of the concept of creation and annihilation operators of particles. In the following years, with
contributions from Wolfgang Pauli, Eugene Wigner, Pascual Jordan, Werner Heisenberg and an elegant formulation of
quantum electrodynamics due to Enrico Fermi, physicists came to believe that, in principle, it would be possible to
perform any computation for any physical process involving photons and charged particles. However, further studies
by Felix Bloch with Arnold Nordsieck, and by Victor Weisskopf, in 1937 and 1939, revealed that such computations
were reliable only at a first order of perturbation theory, a problem already pointed out by Robert Oppenheimer.[6]
At higher orders in the series infinities emerged, making such computations meaningless and casting serious doubts
on the internal consistency of the theory itself. With no solution for this problem known at the time, it appeared
that a fundamental incompatibility existed between special relativity and quantum mechanics.

Difficulties with the theory increased through the end of the 1940s. Improvements in microwave technology made it
possible to take more precise measurements of the shift of the levels of a hydrogen atom, now known as the Lamb
shift, and of the magnetic moment of the electron.[8] These experiments unequivocally exposed discrepancies which
the theory was unable to explain.

A first indication of a possible way out was given by Hans Bethe. In 1947, while he was traveling by train to reach
Schenectady from New York,[9] after giving a talk at the conference at Shelter Island on the subject, Bethe completed
the first non-relativistic computation of the shift of the lines of the hydrogen atom as measured by Lamb and
Retherford.[10] Despite the limitations of the computation, agreement was excellent. The idea was simply to attach
infinities to corrections of mass and charge that were actually fixed to a finite value by experiments. In this way, the
infinities get absorbed in those constants and yield a finite result in good agreement with experiments. This
procedure was named renormalization.

Hans Bethe

Based on Bethe's intuition and fundamental papers on the subject by Sin-Itiro Tomonaga,[11] Julian Schwinger,[12][13]
Richard Feynman[14][15][16] and Freeman Dyson,[17][18] it was finally possible to get fully covariant formulations that
were finite at any order in a perturbation series of quantum electrodynamics. Sin-Itiro Tomonaga, Julian Schwinger
and Richard Feynman were jointly awarded a Nobel Prize in Physics in 1965 for their work in this area.[19] Their
contributions, and those of Freeman Dyson, were about covariant and gauge-invariant formulations of quantum
electrodynamics that allow computations of observables at any order of perturbation theory. Feynman's
mathematical technique, based on his diagrams, initially seemed very different from the field-theoretic,
operator-based approach of Schwinger and Tomonaga, but Freeman Dyson later showed that the two approaches were
equivalent.[17]

Renormalization, the need to attach a physical meaning to certain divergences appearing in the theory through
integrals, has subsequently become one of the fundamental aspects of quantum field theory and has come to be seen
as a criterion for a theory's general acceptability. Even though renormalization works very well in practice,
Feynman was never entirely comfortable with its mathematical validity, even referring to renormalization as a
"shell game" and "hocus pocus".




Shelter Island Conference group photo (Courtesy of Archives, National Academy of Sciences).

Feynman (center) and Oppenheimer (left) at Los Alamos.



QED has served as the model and template for all subsequent quantum field theories. One such subsequent theory is
quantum chromodynamics, which began in the early 1960s and attained its present form in the 1975 work by H.
David Politzer, Sidney Coleman, David Gross and Frank Wilczek. Building on the pioneering work of Schwinger,
Gerald Guralnik, Dick Hagen, Tom Kibble, Peter Higgs, Jeffrey Goldstone, and others, Sheldon Glashow, Steven
Weinberg and Abdus Salam independently showed how the weak nuclear force and quantum electrodynamics could
be merged into a single electroweak force.






Feynman's view of quantum electrodynamics 



Introduction 

Near the end of his life, Richard P. Feynman gave a series of lectures on QED intended for the lay public. These
lectures were transcribed and published as Feynman (1985), QED: The strange theory of light and matter, a
classic non-mathematical exposition of QED from the point of view articulated below.

The key components of Feynman's presentation of QED are three basic actions. 

• A photon goes from one place and time to another place and time. 

• An electron goes from one place and time to another place and time. 

• An electron emits or absorbs a photon at a certain place and time. 

These actions are represented in a form of visual shorthand by the three basic elements of Feynman diagrams: a wavy
line for the photon, a straight line for the electron and a junction of two straight lines and a wavy one for a vertex
representing emission or absorption of a photon by an electron. These may all be seen in the adjacent diagram.

Feynman diagram elements (axes: space and time; panels: photon, electron, photon emission or absorption)

It is important not to over-interpret these diagrams. Nothing is implied about how a particle gets from one point to
another. The diagrams do not imply that the particles are moving in straight or curved lines. They do not imply that
the particles are moving with fixed speeds. The fact that the photon is often represented, by convention, by a wavy
line and not a straight one does not imply that it is thought that it is more wavelike than is an electron. The images
are just symbols to represent the actions above: photons and electrons do, somehow, move from point to point and
electrons, somehow, emit and absorb photons. We do not know how these things happen, but the theory tells us about
the probabilities of these things happening. Trajectory is a meaningless concept in quantum mechanics.

As well as the visual shorthand for the actions, Feynman introduces another kind of shorthand for the numerical
quantities which tell us about the probabilities. If a photon moves from one place and time (in shorthand, A) to
another place and time (in shorthand, B), the associated quantity is written in Feynman's shorthand as P(A to B). The
similar quantity for an electron moving from C to D is written E(C to D). The quantity which tells us about the
probability for the emission or absorption of a photon he calls 'j'. This is related to, but not the same as, the measured
electron charge 'e'.

QED is based on the assumption that complex interactions of many electrons and photons can be represented by 
fitting together a suitable collection of the above three building blocks, and then using the probability-quantities to 
calculate the probability of any such complex interaction. It turns out that the basic idea of QED can be 
communicated while making the assumption that the quantities mentioned above are just our everyday probabilities. 
(A simplification of Feynman's book.) Later on this will be corrected to include specifically quantum mathematics, 
following Feynman. 

The basic rules of probabilities that will be used are that a) if an event can happen in a variety of different ways then 
its probability is the sum of the probabilities of the possible ways and b) if a process involves a number of 
independent subprocesses then its probability is the product of the component probabilities. 






Basic constructions 




Suppose we start with one electron at a certain place and time (this place and time being given the arbitrary label A)
and a photon at another place and time (given the label B). A typical question from a physical standpoint is: 'What is
the probability of finding an electron at C (another place and a later time) and a photon at D (yet another place and
time)?'. The simplest process to achieve this end is for the electron to move from A to C (an elementary action) and
for the photon to move from B to D (another elementary action). Knowing the probabilities of each of these
subprocesses, E(A to C) and P(B to D), we would expect to calculate the probability of both happening
by multiplying them, using rule b) above. This gives a simple estimated answer to our question.

But there are other ways in which the end result could come about. The electron might move to a place and time E
where it absorbs the photon; then move on before emitting another photon at F; then move on to C where it is
detected, while the new photon moves on to D. The probability of this complex process can again be calculated by
knowing the probabilities of each of the individual actions: three electron actions, two photon actions and two
vertices (one emission and one absorption). We would expect to find the total probability by multiplying the
probabilities of each of the actions, for any chosen positions of E and F. We then, using rule a) above, have to add
up all these probabilities for all the alternatives for E and F. (This is not elementary in practice, and involves
integration.) But there is another possibility: that is that the electron first moves to G where it emits a
photon which goes on to D, while the electron moves on to H, where it absorbs the first photon, before moving on to
C. Again we can calculate the probability of these possibilities (for all points G and H). We then have a better
estimation for the total probability by adding the probabilities of these two possibilities to our original simple
estimate. Incidentally the name given to this process of a photon interacting with an electron in this way is Compton
scattering.
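
The bookkeeping of rules a) and b) can be sketched with ordinary numbers. The example below is a toy, in the spirit of the simplification used in this section: the functions `electron_prob` and `photon_prob`, the vertex value `j`, and the three candidate intermediate points are all hypothetical placeholders, not physical quantities, and a real calculation would integrate over all of space and time.

```python
from itertools import product

# Hypothetical toy values: every electron step has probability 0.1,
# every photon step 0.2, and every vertex 0.05.
def electron_prob(start, end):
    return 0.1

def photon_prob(start, end):
    return 0.2

j = 0.05

# Hypothetical candidate places and times for the intermediate events E and F.
intermediate_points = ["X1", "X2", "X3"]

# Rule b): independent subprocesses multiply.
direct = electron_prob("A", "C") * photon_prob("B", "D")

# Rule a): the alternatives (every choice of E and F) add.
absorb_then_emit = sum(
    electron_prob("A", e) * photon_prob("B", e) * j
    * electron_prob(e, f) * j
    * electron_prob(f, "C") * photon_prob(f, "D")
    for e, f in product(intermediate_points, repeat=2)
)

total = direct + absorb_then_emit
print(direct, absorb_then_emit, total)
```

Each extra vertex multiplies a term by another small factor, which is why the more complicated alternatives change the estimate only slightly in this toy.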



Compton scattering 



There are an infinite number of other intermediate processes in which more and more photons are absorbed and/or 
emitted. For each of these possibilities there is a Feynman diagram describing it. This implies a complex 
computation for the resulting probabilities, but provided it is the case that the more complicated the diagram the less 
it contributes to the result, it is only a matter of time and effort to find as accurate an answer as one wants to the 
original question. This is the basic approach of QED. To calculate the probability of any interactive process between 
electrons and photons it is a matter of first noting, with Feynman diagrams, all the possible ways in which the 
process can be constructed from the three basic elements. Each diagram involves some calculation involving definite 
rules to find the associated probability. 

That basic scaffolding remains when one moves to a quantum description, but some conceptual changes are
required. One is that whereas we might expect in our everyday life that there would be some constraints on the
points to which a particle can move, that is not true in full quantum electrodynamics. There is a possibility of
an electron or photon at A moving as a basic action to any other place and time in the universe. That includes places
that could only be reached at speeds greater than that of light and also earlier times. (An electron moving backwards
in time can be viewed as a positron moving forward in time.)






Probability amplitudes 




Addition of probability amplitudes as complex 
numbers 



Quantum mechanics introduces an important change in the way probabilities are computed. It has been found that the
quantities which we have to use to represent the probabilities are not the usual real numbers we use for probabilities
in our everyday world, but complex numbers which are called probability amplitudes. Feynman avoids exposing the
reader to the mathematics of complex numbers by using a simple but accurate representation of them as arrows on a
piece of paper or screen. (These must not be confused with the arrows of Feynman diagrams, which are actually
simplified representations in two dimensions of a relationship between points in three dimensions of space and one
of time.) The amplitude-arrows are fundamental to the description of the world given by quantum theory. No
satisfactory reason has been given for why they are needed. But pragmatically we have to accept that they are an
essential part of our description of all quantum phenomena. They are related to our everyday ideas of probability by
the simple rule that the probability of an event is the square of the length of the corresponding amplitude-arrow.
So, for a given process, if two probability amplitudes, v and w, are involved, the probability of the process will be
given either by

$$P = |v + w|^2$$

or

$$P = |v \times w|^2 .$$

The rules as regards adding or multiplying, however, are the same as above. But where you would expect to add or 
multiply probabilities, instead you add or multiply probability amplitudes that now are complex numbers. 

Addition and multiplication are familiar operations in the theory of complex numbers and are given in the figures.
The sum is found as follows. Let the start of the second arrow be at the end of the first. The sum is then a third
arrow that goes directly from the start of the first to the end of the second. The product of two arrows is an arrow
whose length is the product of the two lengths. The direction of the product is found by adding the angles that each
of the two have been turned through relative to a reference direction: that gives the angle that the product is turned
relative to the reference direction.
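
The arrow rules are ordinary complex arithmetic, which can be tried directly. The sketch below uses Python's built-in complex numbers; the two amplitudes `v` and `w` are arbitrary illustrative values, not amplitudes of any physical process.

```python
import cmath

# Two hypothetical amplitude-arrows, given by length and angle.
v = cmath.rect(0.5, cmath.pi / 6)   # length 0.5, turned 30 degrees
w = cmath.rect(0.4, cmath.pi / 3)   # length 0.4, turned 60 degrees

# Adding arrows tip to tail is complex addition.
arrow_sum = v + w

# Multiplying arrows multiplies the lengths and adds the angles.
arrow_product = v * w
product_length = abs(arrow_product)          # 0.5 * 0.4
product_angle = cmath.phase(arrow_product)   # pi/6 + pi/3

# The probability is the squared length of the resulting arrow.
prob_alternatives = abs(arrow_sum) ** 2      # P = |v + w|^2
prob_sequence = abs(arrow_product) ** 2      # P = |v x w|^2
print(prob_alternatives, prob_sequence)
```

Because the arrows point in different directions, the squared length of the sum is not the sum of the squared lengths; that difference is the interference that distinguishes amplitudes from everyday probabilities.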

That change, from probabilities to probability amplitudes, complicates the mathematics without changing the basic
approach. But that change is still not quite enough because it fails to take into account the fact that both photons
and electrons can be polarized, which is to say that their orientations in space and time have to be taken into
account. Therefore P(A to B) actually consists of 16 complex numbers, or probability amplitude arrows. There are
also some minor changes to do with the quantity "j", which may have to be rotated by a multiple of 90° for some
polarizations, which is only of interest for the detailed bookkeeping.

Associated with the fact that the electron can be polarized is another small necessary detail which is connected with
the fact that an electron is a fermion and obeys Fermi-Dirac statistics. The basic rule is that if we have the
probability amplitude for a given complex process involving more than one electron, then when we include (as we
always must) the complementary Feynman diagram in which we just exchange two electron events, the resulting
amplitude is the reverse (the negative) of the first. The simplest case would be two electrons starting at A and B
ending at C and D. The amplitude would be calculated as the "difference", E(A to D) × E(B to C) − E(A to C) × E(B to
D), where we would expect, from our everyday idea of probabilities, that it would be a sum.
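
The effect of the sign rule can be seen with made-up numbers. In the sketch below the two diagrams are taken to be "A to C and B to D" and its exchange "A to D and B to C"; the four complex values are hypothetical and chosen only for illustration.

```python
# Hypothetical amplitudes for the four single-electron steps.
E_AC, E_BD = 0.6 + 0.2j, 0.5 - 0.1j
E_AD, E_BC = 0.3 + 0.4j, 0.2 + 0.3j

direct = E_AC * E_BD        # electron from A ends at C, electron from B ends at D
exchanged = E_AD * E_BC     # the same process with the two end points swapped

# Fermi-Dirac statistics: the two diagrams enter with opposite signs.
amplitude = exchanged - direct
prob_fermion = abs(amplitude) ** 2

# For comparison, what a plain sum of the two diagrams would have given.
prob_if_summed = abs(exchanged + direct) ** 2
print(prob_fermion, prob_if_summed)
```

If the two diagrams had equal amplitudes the difference would vanish, which is one way of seeing why two electrons cannot end up in the same state.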



Multiplication of probability amplitudes as complex numbers



Propagators 

Finally, one has to compute P(A to B) and E(C to D), the probability amplitudes for the photon and
the electron respectively. These are essentially the solutions of the Dirac equation, which describes the behavior of
the electron's probability amplitude, and the Klein-Gordon equation, which describes the behavior of the photon's
probability amplitude. These are called Feynman propagators. The translation to a notation commonly used in the
standard literature is as follows:

$P(A \text{ to } B) \to D_F(x_B - x_A), \quad E(C \text{ to } D) \to S_F(x_D - x_C)$

where a shorthand symbol such as $x_A$ stands for the four real numbers which give the time and position in three
dimensions of the point labeled A.



Mass renormalization 



A problem arose historically which held up progress for twenty 
years: although we start with the assumption of three basic 
"simple" actions, the rules of the game say that if we want to 
calculate the probability amplitude for an electron to get from A to 
B we must take into account all the possible ways: all possible 
Feynman diagrams with those end points. Thus there will be a way 
in which the electron travels to C, emits a photon there and then 
absorbs it again at D before moving on to B. Or it could do this 
kind of thing twice, or more. In short we have a fractal-like 
situation in which if we look closely at a line it breaks up into a 
collection of "simple" lines, each of which, if looked at closely, is
in turn composed of "simple" lines, and so on ad infinitum. This is
a very difficult situation to handle. If adding that detail only 
altered things slightly then it would not have been too bad, but 
disaster struck when it was found that the simple correction mentioned above led to infinite probability amplitudes. 
In time this problem was "fixed" by the technique of renormalization (see below and the article on mass 
renormalization). However, Feynman himself remained unhappy about it, calling it a "dippy process".[20]




Electron self-energy loop 



Conclusions 

Within the above framework physicists were then able to calculate to a high degree of accuracy some of the 
properties of electrons, such as the anomalous magnetic dipole moment. However, as Feynman points out, it fails 
totally to explain why particles such as the electron have the masses they do. "There is no theory that adequately 
explains these numbers. We use the numbers in all our theories, but we don't understand them — what they are, or 
where they come from. I believe that from a fundamental point of view, this is a very interesting and serious 
problem."[23]






Mathematics 

Mathematically, QED is an abelian gauge theory with the symmetry group U(1). The gauge field, which mediates
the interaction between the charged spin-1/2 fields, is the electromagnetic field. The QED Lagrangian for a spin-1/2
field interacting with the electromagnetic field is given by the real part of

$$\mathcal{L} = \bar\psi (i\gamma^\mu D_\mu - m)\psi - \frac{1}{4} F_{\mu\nu} F^{\mu\nu}$$

where

$\gamma^\mu$ are Dirac matrices;

$\psi$ is a bispinor field of spin-1/2 particles (e.g. electron-positron field);

$\bar\psi \equiv \psi^\dagger \gamma_0$, called "psi-bar", is sometimes referred to as the Dirac adjoint;

$D_\mu \equiv \partial_\mu + ieA_\mu + ieB_\mu$ is the gauge covariant derivative;

e is the coupling constant, equal to the electric charge of the bispinor field;

$A_\mu$ is the covariant four-potential of the electromagnetic field generated by the electron itself;

$B_\mu$ is the external field imposed by an external source;

$F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu$ is the electromagnetic field tensor.
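As a concrete check on two of the objects defined above, the sketch below builds the Dirac matrices in one common choice of basis (the Dirac representation, with metric signature +,-,-,-; both are assumptions, since the text does not fix a representation) and verifies the anticommutation relation $\{\gamma^\mu, \gamma^\nu\} = 2\eta^{\mu\nu}$. It also evaluates $\bar\psi\psi$ for an arbitrary, purely illustrative spinor. All helper names are hypothetical, and only the Python standard library is used.

```python
def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def scale(c, m):
    return [[c * x for x in row] for row in m]

def block(tl, tr, bl, br):
    """Assemble a 4x4 matrix from four 2x2 blocks."""
    return [r1 + r2 for r1, r2 in zip(tl, tr)] + [r1 + r2 for r1, r2 in zip(bl, br)]

I2 = [[1, 0], [0, 1]]
Z2 = [[0, 0], [0, 0]]
sigma = [
    [[0, 1], [1, 0]],       # Pauli sigma_x
    [[0, -1j], [1j, 0]],    # Pauli sigma_y
    [[1, 0], [0, -1]],      # Pauli sigma_z
]

# Dirac representation (an assumed choice): gamma^0 = diag(1, 1, -1, -1),
# gamma^i = [[0, sigma_i], [-sigma_i, 0]].
gamma = [block(I2, Z2, Z2, scale(-1, I2))]
gamma += [block(Z2, s, scale(-1, s), Z2) for s in sigma]

eta = [1, -1, -1, -1]  # diagonal of the Minkowski metric, signature (+,-,-,-)

def anticommutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] + ba[i][j] for j in range(4)] for i in range(4)]

def expected(m, n):
    """2 * eta^{mn} times the 4x4 identity."""
    c = 2 * eta[m] if m == n else 0
    return [[c if i == j else 0 for j in range(4)] for i in range(4)]

clifford_ok = all(anticommutator(gamma[m], gamma[n]) == expected(m, n)
                  for m in range(4) for n in range(4))

def dirac_adjoint(psi):
    """psi-bar = psi^dagger gamma^0, returned as a row of four numbers."""
    return [sum(complex(psi[i]).conjugate() * gamma[0][i][j] for i in range(4))
            for j in range(4)]

psi = [1.0, 0.5j, -0.25, 0.0]          # arbitrary illustrative spinor
psi_bar = dirac_adjoint(psi)
psi_bar_psi = sum(psi_bar[j] * psi[j] for j in range(4))

print(clifford_ok, psi_bar_psi)
```

With this signature $\gamma_0$ and $\gamma^0$ are numerically the same matrix, so the code's `gamma[0]` matches the $\gamma_0$ in the definition of $\bar\psi$.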

Equations of motion 

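To begin, substituting the definition of D into the Lagrangian gives us:

$$\mathcal{L} = i\bar\psi\gamma^\mu\partial_\mu\psi - e\bar\psi\gamma_\mu(A^\mu + B^\mu)\psi - m\bar\psi\psi - \frac{1}{4}F_{\mu\nu}F^{\mu\nu}.$$

Next, we can substitute this Lagrangian into the Euler-Lagrange equation of motion for a field:

$$\partial_\mu\left(\frac{\partial\mathcal{L}}{\partial(\partial_\mu\psi)}\right) - \frac{\partial\mathcal{L}}{\partial\psi} = 0 \qquad (2)$$

to find the field equations for QED.

The two terms from this Lagrangian are then:

$$\partial_\mu\left(\frac{\partial\mathcal{L}}{\partial(\partial_\mu\psi)}\right) = \partial_\mu\left(i\bar\psi\gamma^\mu\right)$$

$$\frac{\partial\mathcal{L}}{\partial\psi} = -e\bar\psi\gamma_\mu(A^\mu + B^\mu) - m\bar\psi.$$

Substituting these two back into the Euler-Lagrange equation (2) results in:

$$i\partial_\mu\bar\psi\gamma^\mu + e\bar\psi\gamma_\mu(A^\mu + B^\mu) + m\bar\psi = 0$$

with complex conjugate:

$$i\gamma^\mu\partial_\mu\psi - e\gamma_\mu(A^\mu + B^\mu)\psi - m\psi = 0.$$

Bringing the middle term to the right-hand side transforms this second equation into:

$$i\gamma^\mu\partial_\mu\psi - m\psi = e\gamma_\mu(A^\mu + B^\mu)\psi.$$

The left-hand side is like the original Dirac equation and the right-hand side is the interaction with the
electromagnetic field.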

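One further important equation can be found by substituting the Lagrangian into another Euler-Lagrange equation,
this time for the field $A^\mu$:

$$\partial_\nu\left(\frac{\partial\mathcal{L}}{\partial(\partial_\nu A_\mu)}\right) - \frac{\partial\mathcal{L}}{\partial A_\mu} = 0 \qquad (3)$$

The two terms this time are:

$$\partial_\nu\left(\frac{\partial\mathcal{L}}{\partial(\partial_\nu A_\mu)}\right) = \partial_\nu\left(\partial^\mu A^\nu - \partial^\nu A^\mu\right)$$

$$\frac{\partial\mathcal{L}}{\partial A_\mu} = -e\bar\psi\gamma^\mu\psi$$

and these two terms, when substituted back into (3) give us:

$$\partial_\nu F^{\nu\mu} = e\bar\psi\gamma^\mu\psi.$$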
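A short consistency check, offered as a sketch rather than as part of the original derivation: the field equation for $A^\mu$ obtained from (3) reads $\partial_\nu F^{\nu\mu} = e\bar\psi\gamma^\mu\psi$. Since $F^{\nu\mu}$ is antisymmetric in its indices while $\partial_\mu\partial_\nu$ is symmetric, applying one more derivative makes the left-hand side vanish, so the source term on the right must be a conserved current:

```latex
\partial_\mu \partial_\nu F^{\nu\mu} = 0
\quad\Longrightarrow\quad
\partial_\mu \left( e\,\bar\psi \gamma^\mu \psi \right) = 0 .
```

This is the statement of electric charge conservation in this notation.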



Interaction picture 

This theory can be straightforwardly quantized by treating the bosonic and fermionic sectors as free. This makes it
possible to build a set of asymptotic states from which to start computing the probability amplitudes for different
processes. To do so, we have to compute an evolution operator U that, for a given initial state |i⟩, gives the
amplitude for ending in a final state |f⟩:

$$M_{fi} = \langle f | U | i \rangle.$$

This technique is also known as the S-matrix. The evolution operator is obtained in the interaction picture, where time
evolution is given by the interaction Hamiltonian. From the equations above, this is

$$V = e \int d^3x\, \bar\psi \gamma^\mu \psi A_\mu$$

and so one has

$$U = T \exp\left[-\frac{i}{\hbar} \int_{t_0}^{t} dt'\, V(t')\right]$$

where T is the time-ordering operator. This evolution operator has a meaning only as a series, and what we get here is a
perturbation series whose expansion parameter is the fine structure constant. This series is called the Dyson series.
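The structure of the time-ordered exponential and its series expansion can be illustrated on a toy model. The sketch below is not QED: it assumes a hypothetical two-level system with interaction $V(t) = g(\sigma_x \cos\omega t + \sigma_y \sin\omega t)$, units with $\hbar = 1$, and arbitrary values of g, ω and the total time. It builds U as a time-ordered product of short steps and compares it with the series truncated at first and second order in g. All names are illustrative.

```python
import math

IDENTITY = [[1 + 0j, 0j], [0j, 1 + 0j]]
ZERO = [[0j, 0j], [0j, 0j]]

def matmul(a, b):
    """Product of two 2x2 complex matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def add(a, b, scale=1.0):
    """Return a + scale * b for 2x2 matrices."""
    return [[a[i][j] + scale * b[i][j] for j in range(2)] for i in range(2)]

def frobenius(a):
    return math.sqrt(sum(abs(x) ** 2 for row in a for x in row))

def interaction(t, g, omega):
    """Toy V(t) = g (sigma_x cos(omega t) + sigma_y sin(omega t)); an assumption."""
    phase = complex(math.cos(omega * t), math.sin(omega * t))
    return [[0j, g * phase.conjugate()], [g * phase, 0j]]

def step(t, dt, g, omega):
    """Exact exp(-i V(t) dt), using the fact that (V/g)^2 is the identity."""
    v = interaction(t, g, omega)
    c, s = math.cos(g * dt), math.sin(g * dt)
    return [[c * IDENTITY[i][j] - 1j * (s / g) * v[i][j] for j in range(2)]
            for i in range(2)]

def evolve(total_time=2.0, steps=2000, g=0.1, omega=1.0):
    dt = total_time / steps
    u = IDENTITY      # time-ordered product of short-time steps
    first = ZERO      # running integral of V(t') dt'
    second = ZERO     # running time-ordered double integral of V(t') V(t'')
    for k in range(steps):
        t = (k + 0.5) * dt
        v = interaction(t, g, omega)
        second = add(second, matmul(v, first), dt)  # 'first' holds earlier times only
        first = add(first, v, dt)
        u = matmul(step(t, dt, g, omega), u)        # later times stand to the left
    order1 = add(IDENTITY, first, -1j)              # 1 - i * integral of V
    order2 = add(order1, second, -1.0)              # ... minus the ordered double integral
    return u, order1, order2

u, order1, order2 = evolve()
err0 = frobenius(add(u, IDENTITY, -1.0))
err1 = frobenius(add(u, order1, -1.0))
err2 = frobenius(add(u, order2, -1.0))
print(err0, err1, err2)
```

For a small coupling g, each additional order of the series brings the truncated sum closer to the full time-ordered product, which is the sense in which the series is used in practice.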

Feynman diagrams 

Despite the conceptual clarity of this Feynman approach to QED, almost no textbooks follow him in their 
presentation. When performing calculations it is much easier to work with the Fourier transforms of the propagators. 
Quantum physics considers particles' momenta rather than their positions, and it is convenient to think of particles as
being created or annihilated when they interact. Feynman diagrams then look the same, but the lines have different 
interpretations. The electron line represents an electron with a given energy and momentum, with a similar 
interpretation of the photon line. A vertex diagram represents the annihilation of one electron and the creation of 
another together with the absorption or creation of a photon, each having specified energies and momenta. 

Using Wick's theorem on the terms of the Dyson series, all the terms of the S-matrix for quantum electrodynamics can
be computed through the technique of Feynman diagrams. In this case the rules for drawing are the following:






Propagator (internal line): a factor with denominator $p^2 + i\epsilon$

Vertex: $-ie\gamma^\mu (2\pi)^4 \delta^4(p_1 + p_2 + p_3)$

Incoming antifermion: $\bar v_\alpha(p, s)$

Outgoing fermion: $\bar u_\alpha(p, s)$

Outgoing antifermion: $v_\alpha(p, s)$

Incoming photon: $\epsilon_\mu(k, \lambda)$

Outgoing photon: $\epsilon_\mu(k, \lambda)^*$

To these rules we must add a further one for closed loops that implies an integration on momenta $\int d^4p / (2\pi)^4$.

From them, computations of probability amplitudes are straightforwardly given. An example is Compton scattering, 
with an electron and a photon undergoing elastic scattering. Feynman diagrams are in this case 







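and so we are able to get the corresponding amplitude at lowest order of the perturbation series for the S-matrix:

$$M_{fi} = (ie)^2\, \bar u(p', s')\, \slashed{\epsilon}'(k', \lambda')^* \frac{\slashed{p} + \slashed{k} + m_e}{(p + k)^2 - m_e^2}\, \slashed{\epsilon}(k, \lambda)\, u(p, s) + (ie)^2\, \bar u(p', s')\, \slashed{\epsilon}(k, \lambda) \frac{\slashed{p} - \slashed{k}' + m_e}{(p - k')^2 - m_e^2}\, \slashed{\epsilon}'(k', \lambda')^*\, u(p, s)$$

from which we are able to compute the cross section for this scattering.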



Renormalizability 

Higher-order terms can be straightforwardly computed for the evolution operator, but these terms display diagrams
containing the following simpler ones




One-loop contribution to the vacuum polarization function $\Pi$

One-loop contribution to the electron self-energy function $\Sigma$

One-loop contribution to the vertex function $\Gamma$



that, being closed loops, imply the presence of diverging integrals having no mathematical meaning. To overcome
this difficulty, the technique of renormalization was devised, producing finite results in very close agreement
with experiments. An important criterion for a theory to be meaningful after renormalization is that the
number of diverging diagrams is finite. In this case the theory is said to be renormalizable. The reason for this is that
renormalizing the observables then requires only a finite number of constants, so the predictive value of the theory
is preserved. This is exactly the case for quantum electrodynamics, which displays just three diverging diagrams. This
procedure gives observables in very close agreement with experiment, as seen e.g. for the electron gyromagnetic ratio.

Renormalizability has become an essential criterion for a quantum field theory to be considered as a viable one. All 
the theories describing fundamental interactions, except gravitation whose quantum counterpart is presently under 
very active research, are renormalizable theories. 






Nonconvergence of series 

An argument by Freeman Dyson shows that the radius of convergence of the perturbation series in QED is zero.[24]
The basic argument goes as follows: if the coupling constant were negative, this would be equivalent to the Coulomb
force constant being negative. This would "reverse" the electromagnetic interaction so that like charges would attract
and unlike charges would repel. This would render the vacuum unstable against decay into a cluster of electrons on
one side of the universe and a cluster of positrons on the other side of the universe. A power series with a nonzero
radius of convergence would converge for small negative values of the coupling constant as well as for positive ones.
Because the theory is sick for any negative value of the coupling constant, the series does not converge but is an
asymptotic series. This can be taken as a need for a new theory, a problem with perturbation theory, or ignored by
taking a "shut-up-and-calculate" approach.
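The behavior of an asymptotic series with zero radius of convergence can be seen in a standard textbook toy model, used here as an assumption in place of QED itself: the integral $I(g) = \int_0^\infty e^{-t}/(1 + gt)\,dt$ has the formal expansion $\sum_n (-1)^n n!\, g^n$, and, like the QED series in Dyson's argument, the function is ill-behaved for negative g (the integrand then has a pole on the integration path). The sketch below compares partial sums with a numerical value of the integral; the function names, the value g = 0.1 and the quadrature settings are illustrative choices.

```python
import math

def stieltjes_integral(g, upper=60.0, intervals=60000):
    """Simpson estimate of I(g) = integral from 0 to infinity of exp(-t) / (1 + g t).

    The integral is cut off at `upper`, where exp(-t) is negligible;
    `intervals` must be even for Simpson's rule.
    """
    h = upper / intervals
    f = lambda t: math.exp(-t) / (1.0 + g * t)
    total = f(0.0) + f(upper)
    for k in range(1, intervals):
        total += (4 if k % 2 else 2) * f(k * h)
    return total * h / 3.0

def partial_sums(g, max_order):
    """Partial sums of the formal series sum_n (-1)^n n! g^n."""
    sums, total = [], 0.0
    for n in range(max_order + 1):
        total += (-1) ** n * math.factorial(n) * g ** n
        sums.append(total)
    return sums

g = 0.1
exact = stieltjes_integral(g)
errors = [abs(s - exact) for s in partial_sums(g, 30)]
best_order = min(range(len(errors)), key=errors.__getitem__)

print("best truncation order:", best_order)
print("error at order 0, best order, order 30:",
      errors[0], errors[best_order], errors[30])
```

The error first shrinks as terms are added, reaches a minimum at an order of roughly 1/g, and then grows without bound. Truncating near that optimal order is what makes an asymptotic series useful despite its divergence.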



References 

[1] Feynman, Richard (1985). "Chapter 1". QED: The Strange Theory of Light and Matter. Princeton University Press, p. 6.
ISBN 978-0691125756.
[2] P.A.M. Dirac (1927). "The Quantum Theory of the Emission and Absorption of Radiation". Proceedings of the Royal Society of London A
114: 243-265. doi:10.1098/rspa.1927.0039.
[3] E. Fermi (1932). "Quantum Theory of Radiation". Reviews of Modern Physics 4: 87-132. doi:10.1103/RevModPhys.4.87.
[4] F. Bloch; A. Nordsieck (1937). "Note on the Radiation Field of the Electron". Physical Review 52: 54-59. doi:10.1103/PhysRev.52.54.
[5] V. F. Weisskopf (1939). "On the Self-Energy and the Electromagnetic Field of the Electron". Physical Review 56: 72-85.
doi:10.1103/PhysRev.56.72.
[6] R. Oppenheimer (1930). "Note on the Theory of the Interaction of Field and Matter". Physical Review 35: 461-477.
doi:10.1103/PhysRev.35.461.
[7] W. E. Lamb; R. C. Retherford (1947). "Fine Structure of the Hydrogen Atom by a Microwave Method". Physical Review 72: 241-243.
doi:10.1103/PhysRev.72.241.
[8] P. Kusch; H. M. Foley (1948). "On the Intrinsic Moment of the Electron". Physical Review 73: 412. doi:10.1103/PhysRev.74.250.
[9] Schweber, Silvan (1994). "Chapter 5". QED and the Men Who Made It: Dyson, Feynman, Schwinger, and Tomonaga. Princeton University
Press, p. 230. ISBN 978-0691033273.
[10] H. Bethe (1947). "The Electromagnetic Shift of Energy Levels". Physical Review 72: 339-341. doi:10.1103/PhysRev.72.339.
[11] S. Tomonaga (1946). "On a Relativistically Invariant Formulation of the Quantum Theory of Wave Fields". Progress of Theoretical Physics
1: 27-42. doi:10.1143/PTP.1.27.
[12] J. Schwinger (1948). "On Quantum-Electrodynamics and the Magnetic Moment of the Electron". Physical Review 73: 416-417.
doi:10.1103/PhysRev.73.416.
[13] J. Schwinger (1948). "Quantum Electrodynamics. I. A Covariant Formulation". Physical Review 74: 1439-1461.
doi:10.1103/PhysRev.74.1439.
[14] R. P. Feynman (1949). "Space-Time Approach to Quantum Electrodynamics". Physical Review 76: 769-789. doi:10.1103/PhysRev.76.769.
[15] R. P. Feynman (1949). "The Theory of Positrons". Physical Review 76: 749-759. doi:10.1103/PhysRev.76.749.
[16] R. P. Feynman (1950). "Mathematical Formulation of the Quantum Theory of Electromagnetic Interaction". Physical Review 80: 440-457.
doi:10.1103/PhysRev.80.440.
[17] F. Dyson (1949). "The Radiation Theories of Tomonaga, Schwinger, and Feynman". Physical Review 75: 486-502.
doi:10.1103/PhysRev.75.486.
[18] F. Dyson (1949). "The S Matrix in Quantum Electrodynamics". Physical Review 75: 1736-1755. doi:10.1103/PhysRev.75.1736.
[19] "The Nobel Prize in Physics 1965" (http://nobelprize.org/nobel_prizes/physics/laureates/1965/index.html). Nobel Foundation.
Retrieved 2008-10-09.
[20] Feynman, Richard (1985). QED: The Strange Theory of Light and Matter. Princeton University Press, p. 128. ISBN 978-0691125756.
[21] G.S. Guralnik, C.R. Hagen, T.W.B. Kibble (1964). "Global Conservation Laws and Massless Particles". Physical Review Letters 13:
585-587. doi:10.1103/PhysRevLett.13.585.
[22] G.S. Guralnik (2009). "The History of the Guralnik, Hagen and Kibble development of the Theory of Spontaneous Symmetry Breaking and
Gauge Particles". International Journal of Modern Physics A 24: 2601-2627. doi:10.1142/S0217751X09045431. arXiv:0907.3466.
[23] Feynman, Richard (1985). QED: The Strange Theory of Light and Matter. Princeton University Press, p. 152. ISBN 978-0691125756.
[24] Kinoshita, Toichiro. "Quantum Electrodynamics has Zero Radius of Convergence" (http://www.lassp.cornell.edu/sethna/Cracks/QED.html).
Summarized from Toichiro Kinoshita. Retrieved 06-10-2010.






Further reading 
Books 

• De Broglie, Louis (1925). Recherches sur la théorie des quanta [Research on quantum theory]. France:
Wiley-Interscience.

• Feynman, Richard Phillips (1998). Quantum Electrodynamics. Westview Press; New Ed edition.
ISBN 978-0201360752.

• Jauch, J.M.; Rohrlich, F. (1980). The Theory of Photons and Electrons. Springer-Verlag. ISBN 978-0387072951.

• Greiner, Walter; Bromley, D.A.; Müller, Berndt (2000). Gauge Theory of Weak Interactions. Springer.
ISBN 978-3540676720.

• Kane, Gordon L. (1993). Modern Elementary Particle Physics. Westview Press. ISBN 978-0201624601.

• Miller, Arthur I. (1995). Early Quantum Electrodynamics: A Sourcebook. Cambridge University Press.
ISBN 978-0521568913.

• Milonni, Peter W. (1994). The quantum vacuum: an introduction to quantum electrodynamics. Academic Press.
ISBN 0-12-498080-5.

• Schweber, Silvan S. (1994). QED and the Men Who Made It. Princeton University Press.
ISBN 978-0691033273.

• Schwinger, Julian (1958). Selected Papers on Quantum Electrodynamics. Dover Publications.
ISBN 978-0486604442.

• Cohen-Tannoudji, Claude; Dupont-Roc, Jacques; Grynberg, Gilbert (1997). Photons and Atoms: Introduction
to Quantum Electrodynamics. Wiley-Interscience. ISBN 978-0471184331.

Journals 

• Dudley, J.M., and Kwan, A.M. (1996) "Richard Feynman's popular lectures on quantum electrodynamics: The 
1979 Robb Lectures at Auckland University," American Journal of Physics 64: 694-698. 

External links 

• Feynman's Nobel Prize lecture describing the evolution of QED and his role in it
(http://nobelprize.org/physics/laureates/1965/feynman-lecture.html)

• Feynman's New Zealand lectures on QED for non-physicists (http://www.vega.org.uk/video/subseries/8)






Quantum field theory 

Quantum field theory (QFT)[1] provides a theoretical framework for constructing quantum mechanical models of
systems classically parametrized (represented) by an infinite number of dynamical degrees of freedom, that is, fields
and (in a condensed matter context) many-body systems. It is the natural and quantitative language of particle
physics and condensed matter physics. Most theories in modern particle physics, including the Standard Model of
elementary particles and their interactions, are formulated as relativistic quantum field theories. Quantum field
theories are used in many contexts. Elementary particle physics is the most vital example: there the number of
particles going into a reaction can differ from the number coming out. They are also used for the description of
critical phenomena and quantum phase transitions, such as in the BCS theory of superconductivity (see also phase
transition, quantum phase transition, critical phenomena). Quantum field theory is thought by many to be the unique
and correct outcome of combining the rules of quantum mechanics with special relativity.

In perturbative quantum field theory, the forces between particles are mediated by other particles. The 
electromagnetic force between two electrons is caused by an exchange of photons. Intermediate vector bosons 
mediate the weak force and gluons mediate the strong force. There is currently no complete quantum theory of the 
remaining fundamental force, gravity, but many of the proposed theories postulate the existence of a graviton 
particle that mediates it. These force-carrying particles are virtual particles and, by definition, cannot be detected 
while carrying the force, because such detection will imply that the force is not being carried. In addition, the notion 
of "force mediating particle" comes from perturbation theory, and thus does not make sense in a context of bound 
states. 

In QFT photons are not thought of as 'little billiard balls'; they are considered to be field quanta, necessarily
chunked ripples in a field, or "excitations," that 'look like' particles. Fermions, like the electron, can also be described
as ripples/excitations in a field, where each kind of fermion has its own field. In summary, the classical visualisation 
of "everything is particles and fields," in quantum field theory, resolves into "everything is particles," which then 
resolves into "everything is fields." In the end, particles are regarded as excited states of a field (field quanta). The 
gravitational field and the electromagnetic field are the only two fundamental fields in Nature that have infinite range 
and a corresponding classical low-energy limit, which greatly diminishes and hides their "particle-like" excitations. 
Albert Einstein, in 1905, attributed "particle-like" and discrete exchanges of momenta and energy, characteristic of 
"field quanta," to the electromagnetic field. Originally, his principal motivation was to explain the thermodynamics 
of radiation. Although it is often claimed that the photoelectric and Compton effects require a quantum description of 
the EM field, this is now understood to be untrue, and proper proof of the quantum nature of radiation is now taken 
up into modern quantum optics as in the antibunching effect. The word "photon" was coined in 1926 by the great
physical chemist Gilbert Newton Lewis (see also the articles photon antibunching and laser). 

The "low-energy limit" of the correct quantum field-theoretic description of the electromagnetic field, quantum 
electrodynamics, is believed to become James Clerk Maxwell's 1864 theory, although the "classical limit" of 
quantum electrodynamics has not been as widely explored as that of quantum mechanics. Presumably, the as yet 
unknown correct quantum field-theoretic treatment of the gravitational field will become and "look exactly like" 
Einstein's general theory of relativity in the "low-energy limit." Indeed, quantum field theory itself is quite possibly 
the low-energy-effective-field-theory limit of a more fundamental theory such as superstring theory. Compare in this 
context the article effective field theory. 






History 

Quantum field theory originated in the 1920s from the problem of creating a quantum mechanical theory of the 
electromagnetic field. In 1925 Werner Heisenberg, Max Born, and Pascual Jordan constructed such a theory by 
expressing the field's internal degrees of freedom as an infinite set of harmonic oscillators and by employing the 
usual procedure for quantizing those oscillators. This theory assumed that no electric charges or currents were 
present, and today would be called a free field theory. The first reasonably complete theory of quantum 
electrodynamics, which included both the electromagnetic field and electrically charged matter (specifically, 
electrons) as quantum mechanical objects, was created by Paul Dirac in 1927. This quantum field theory could be 
used to model important processes such as the emission of a photon by an electron dropping into a quantum state of 
lower energy, a process in which the number of particles changes — one atom in the initial state becomes an atom 
plus a photon in the final state. It is now understood that the ability to describe such processes is one of the most 
important features of quantum field theory. 

It was evident from the beginning that a proper quantum treatment of the electromagnetic field had to somehow 
incorporate Einstein's relativity theory, which had after all grown out of the study of classical electromagnetism. This 
need to put together relativity and quantum mechanics was the second major motivation in the development of 
quantum field theory. Pascual Jordan and Wolfgang Pauli showed in 1928 that quantum fields could be made to 
behave in the way predicted by special relativity during coordinate transformations (specifically, they showed that 
the field commutators were Lorentz invariant). A further boost for quantum field theory came with the discovery of 
the Dirac equation, which was originally formulated and interpreted as a single-particle equation analogous to the
Schrödinger equation, but unlike the Schrödinger equation, the Dirac equation satisfies both Lorentz invariance, that
is, the requirements of special relativity, and the rules of quantum mechanics. The Dirac equation accommodated the
spin-1/2 value of the electron and accounted for its magnetic moment as well as giving accurate predictions for the
spectra of hydrogen. The attempted interpretation of the Dirac equation as a single-particle equation could not be
maintained long, however, and finally it was shown that several of its undesirable properties (such as
negative-energy states) could be made sense of by reformulating and reinterpreting the Dirac equation as a true field
equation, in this case for the quantized "Dirac field" or the "electron field", with the "negative-energy solutions"
pointing to the existence of antiparticles. This work was performed first by Dirac himself with the invention of hole
theory in 1930 and also by Wendell Furry, Robert Oppenheimer, Vladimir Fock, and others. Schrödinger, during the
same period that he discovered his famous equation in 1926, also independently found the relativistic generalization
of it known as the Klein-Gordon equation but dismissed it since, without spin, it predicted impossible properties for
the hydrogen spectrum. See Oskar Klein, Walter Gordon. All relativistic wave equations that describe spin-zero
particles are said to be of the Klein-Gordon type. 

A subtle and careful analysis in 1933 and later in 1950 by Niels Bohr and Leon Rosenfeld showed that there is a 
fundamental limitation on the ability to simultaneously measure the electric and magnetic field strengths that enter 
into the description of charges in interaction with radiation, imposed by the uncertainty principle, which must apply 
to all canonically conjugate quantities. This limitation is crucial for the successful formulation and interpretation of a 
quantum field theory of photons and electrons (quantum electrodynamics), and indeed, any perturbative quantum field
theory. The analysis of Bohr and Rosenfeld explains fluctuations in the values of the electromagnetic field that differ 
from the classically "allowed" values distant from the sources of the field. Their analysis was crucial to showing that 
the limitations and physical implications of the uncertainty principle apply to all dynamical systems, whether fields 
or material particles. Their analysis also convinced most people that any notion of returning to a fundamental 
description of nature based on classical field theory, such as what Einstein aimed at with his numerous and failed 
attempts at a classical unified field theory, was simply out of the question. 

The third thread in the development of quantum field theory was the need to handle the statistics of many-particle 
systems consistently and with ease. In 1927 Jordan tried to extend the canonical quantization of fields to the 
many-body wave functions of identical particles, a procedure that is sometimes called second quantization. In 1928,
Jordan and Eugene Wigner found that the quantum field describing electrons, or other fermions, had to be expanded
using anti-commuting creation and annihilation operators due to the Pauli exclusion principle. This thread of 
development was incorporated into many-body theory and strongly influenced condensed matter physics and nuclear 
physics. 

Despite its early successes quantum field theory was plagued by several serious theoretical difficulties. Basic 
physical quantities, such as the self-energy of the electron, the energy shift of electron states due to the presence of 
the electromagnetic field, gave infinite, divergent contributions — a nonsensical result — when computed using the 
perturbative techniques available in the 1930s and most of the 1940s. The electron self-energy problem was already 
a serious issue in the classical electromagnetic field theory, where the attempt to attribute to the electron a finite size 
or extent (the classical electron-radius) led immediately to the question of what non-electromagnetic stresses would 
need to be invoked, which would presumably hold the electron together against the Coulomb repulsion of its 
finite-sized "parts". The situation was dire, and had certain features that reminded many of the "Rayleigh-Jeans
difficulty". What made the situation in the 1940s so desperate and gloomy, however, was the fact that the correct 
ingredients (the second-quantized Maxwell-Dirac field equations) for the theoretical description of interacting 
photons and electrons were well in place, and no major conceptual change was needed analogous to that which was 
necessitated by a finite and physically sensible account of the radiative behavior of hot objects, as provided by the 
Planck radiation law. 

This "divergence problem" was solved in the case of quantum electrodynamics during the late 1940s and early 1950s
by Hans Bethe, Tomonaga, Schwinger, Feynman, and Dyson, through the procedure known as renormalization.
Great progress was made after realizing that all infinities in quantum electrodynamics are related to two effects:
the self-energy of the electron/positron and vacuum polarization. Renormalization requires paying
very careful attention to just what is meant by, for example, the very concepts "charge" and "mass" as they occur in
the pure, non-interacting field equations. The "vacuum" is itself polarizable and, hence, populated by virtual particle
(on shell and off shell) pairs, and, hence, is a seething and busy dynamical system in its own right. This was a critical
step in identifying the source of "infinities" and "divergences". The "bare mass" and the "bare charge" of a particle,
the values that appear in the free-field equations (non-interacting case), are abstractions that are simply not realized
in experiment (in interaction). What we measure, and hence what we must take account of with our equations, and
what the solutions must account for, are the "renormalized mass" and the "renormalized charge" of a particle. That is
to say, the "shifted" or "dressed" values these quantities must have, when due care is taken to include all deviations
from their "bare values", are dictated by the very nature of quantum fields themselves.

The first approach that bore fruit is known as the "interaction representation," (see the article Interaction picture) a 
Lorentz covariant and gauge-invariant generalization of time-dependent perturbation theory used in ordinary 
quantum mechanics, and developed by Tomonaga and Schwinger, generalizing earlier efforts of Dirac, Fock and 
Podolsky. Tomonaga and Schwinger invented a relativistically covariant scheme for representing field commutators 
and field operators intermediate between the two main representations of a quantum system, the Schrodinger and the 
Heisenberg representations (see the article on quantum mechanics). Within this scheme, field commutators at 
separated points can be evaluated in terms of "bare" field creation and annihilation operators. This allows for keeping 
track of the time-evolution of both the "bare" and "renormalized", or perturbed, values of the Hamiltonian and 
expresses everything in terms of the coupled, gauge invariant "bare" field-equations. Schwinger gave the most 
elegant formulation of this approach. The next and most famous development is due to Feynman, with his
brilliant rules for assigning a "graph" or "diagram" to the terms in the scattering matrix (see S-matrix and Feynman
diagrams). These directly corresponded (through the Schwinger-Dyson equation) to the measurable physical
processes (cross sections, probability amplitudes, decay widths and lifetimes of excited states) one needs to be able
to calculate. This revolutionized how quantum field theory calculations are carried out in practice.

Two classic text-books from the 1960s, J.D. Bjorken and S.D. Drell, Relativistic Quantum Mechanics (1964) and J.J. 
Sakurai, Advanced Quantum Mechanics (1967), thoroughly developed the Feynman graph expansion techniques
using physically intuitive and practical methods following from the correspondence principle, without worrying
about the technicalities involved in deriving the Feynman rules from the superstructure of quantum field theory 
itself. Although both Feynman's heuristic and pictorial style of dealing with the infinities and the formal
methods of Tomonaga and Schwinger worked extremely well and gave spectacularly accurate answers, the true
analytical nature of the question of "renormalizability", that is, whether any theory formulated as a "quantum field
theory" would give finite answers, was not worked out until much later, when the urgency of trying to formulate finite
theories for the strong and electroweak (and gravitational) interactions demanded its solution.

Renormalization in the case of QED was largely fortuitous: the smallness of the coupling constant (the so-called fine
structure constant), the fact that the coupling has no dimensions involving mass, and the zero mass of the
gauge boson involved, the photon, rendered the small-distance/high-energy behavior of QED manageable. Also,
electromagnetic processes are very "clean" in the sense that they are not badly suppressed/damped and/or hidden by
the other gauge interactions. In 1958 Sidney Drell observed: "Quantum electrodynamics (QED) has achieved a
status of peaceful coexistence with its divergences...."

The unification of the electromagnetic force with the weak force encountered initial difficulties due to the lack of 
accelerator energies high enough to reveal processes beyond the Fermi interaction range. Additionally, a satisfactory 
theoretical understanding of hadron substructure had to be developed, culminating in the quark model. 

In the case of the strong interactions, progress concerning their short-distance/high-energy behavior was much
slower and more frustrating. For strong interactions with the electroweak fields, there were difficult issues regarding
the strength of coupling, the mass generation of the force carriers as well as their non-linear self-interactions.
Although there has been theoretical progress toward a grand unified quantum field theory incorporating the 
electro-magnetic force, the weak force and the strong force, empirical verification is still pending. Superunification, 
incorporating the gravitational force, is still very speculative, and is under intensive investigation by many of the best 
minds in contemporary theoretical physics. Gravitation is a tensor field description of a spin-2 gauge-boson, the 
"graviton", and is further discussed in the articles on general relativity and quantum gravity. 

From the point of view of the techniques of (four-dimensional) quantum field theory, and as the numerous and heroic 
efforts to formulate a consistent quantum gravity theory by some very able minds attest, gravitational quantization 
was, and is still, the reigning champion for bad behavior. There are problems and frustrations stemming from the fact 
that the gravitational coupling constant has dimensions involving inverse powers of mass, and as a simple 
consequence, it is plagued by badly behaved (in the sense of perturbation theory) non-linear and violent 
self-interactions. Gravity, basically, gravitates, which in turn gravitates, and so on (i.e., gravity is itself a source of 
gravity), thus creating a nightmare at all orders of perturbation theory. Also, gravity couples to all energy equally 
strongly, as per the equivalence principle, so the notion of ever really "switching off", "cutting off" or 
separating the gravitational interaction from other interactions is ambiguous and impossible since, with gravitation, we 
are dealing with the very structure of space-time itself. (See general covariance and, for a modest yet highly 
non-trivial and significant interplay between QFT and gravitation (spacetime), see the article Hawking radiation 
and references cited therein. See also quantum field theory in curved spacetime.) 

Thanks to the somewhat brute-force, clanky and heuristic methods of Feynman, and the elegant and abstract methods 
of Tomonaga/Schwinger, from the period of early renormalization, we do have the modern theory of quantum 
electrodynamics (QED). It is still the most accurate physical theory known, the prototype of a successful quantum 
field theory. Beginning in the 1950s with the work of Yang and Mills, as well as Ryoyu Utiyama, following the 
previous lead of Weyl and Pauli, deep explorations illuminated the types of symmetries and invariances any field 
theory must satisfy. QED, and indeed, all field theories, were generalized to a class of quantum field theories known 
as gauge theories. Quantum electrodynamics is the most famous example of what is known as an Abelian gauge 
theory. It relies on the symmetry group U(1) and has one massless gauge field, with the U(1) gauge symmetry dictating 
the form of the interactions involving the electromagnetic field and the photon being the gauge boson. That 
symmetries dictate, limit and necessitate the form of interaction between particles is the essence of the "gauge theory 
revolution." Yang and Mills formulated the first explicit example of a non-Abelian gauge theory, Yang-Mills theory, 
with an attempted explanation of the strong interactions in mind. In the mid-1950s the strong interactions were 
(incorrectly) understood to be mediated by the pi-mesons, the particles predicted by Hideki Yukawa in 1935, 
based on his profound reflections concerning the reciprocal connection between the mass of any force-mediating 
particle and the range of the force it mediates. This was allowed by the uncertainty principle. The 1960s and 1970s 
saw the formulation of a gauge theory now known as the Standard Model of particle physics, which systematically 
describes the elementary particles and the interactions between them. 

The electroweak interaction part of the standard model was formulated by Sheldon Glashow in the years 1958-60 
with his discovery of the SU(2)×U(1) group structure of the theory. Steven Weinberg and Abdus Salam brilliantly 
invoked the Anderson-Higgs mechanism to generate the W and Z masses (the intermediate vector 
bosons responsible for the weak interactions and neutral currents) while keeping the mass of the photon zero. The 
Goldstone/Higgs idea for generating mass in gauge theories was sparked in the late 1950s and early 1960s when a 
number of theoreticians (including Yoichiro Nambu, Steven Weinberg, Jeffrey Goldstone, François Englert, Robert 
Brout, G. S. Guralnik, C. R. Hagen, Tom Kibble and Philip Warren Anderson) noticed a possibly useful analogy to 
the (spontaneous) breaking of the U(1) symmetry of electromagnetism in the formation of the BCS ground state of a 
superconductor. The gauge boson involved in this situation, the photon, behaves as though it has acquired a finite 
mass. There is a further possibility that the physical vacuum (ground state) does not respect the symmetries implied 
by the "unbroken" electroweak Lagrangian (see the article Electroweak interaction for more details) from which one 
arrives at the field equations. The electroweak theory of Weinberg and Salam was shown to be renormalizable 
(finite) and hence consistent by Gerardus 't Hooft and Martinus Veltman. The Glashow-Weinberg-Salam theory 
(GWS theory) is a triumph and, in certain applications, gives an accuracy on a par with quantum electrodynamics. 

Also during the 1970s parallel developments in the study of phase transitions in condensed matter physics led Leo 
Kadanoff, Michael Fisher and Kenneth Wilson (extending work of Ernst Stueckelberg, André Petermann, Murray 
Gell-Mann, and Francis Low) to a set of ideas and methods known as the renormalization group. By providing a 
better physical understanding of the renormalization procedure invented in the 1940s, the renormalization group 
sparked what has been called the "grand synthesis" of theoretical physics, uniting the quantum field theoretical 
techniques used in particle physics and condensed matter physics into a single theoretical framework. 

The study of quantum field theory is alive and flourishing, as are applications of this method to many physical 
problems. It remains one of the most vital areas of theoretical physics today, providing a common language to many 
branches of physics. 

Principles of quantum field theory 
Classical fields and quantum fields 

Quantum mechanics, in its most general formulation, is a theory of abstract operators (observables) acting on an 
abstract state space (Hilbert space), where the observables represent physically observable quantities and the state 
space represents the possible states of the system under study. Furthermore, each observable corresponds, in a 
technical sense, to the classical idea of a degree of freedom. For instance, the fundamental observables associated 
with the motion of a single quantum mechanical particle are the position and momentum operators x and p. 
Ordinary quantum mechanics deals with systems such as this, which possess a small set of degrees of freedom. 
(It is important to note, at this point, that this article does not use the word "particle" in the context of wave-particle 
duality. In quantum field theory, "particle" is a generic term for any discrete quantum mechanical entity, such as an 
electron or photon, which can behave like classical particles or classical waves under different experimental 
conditions.) 

A quantum field is a quantum mechanical system containing a large, and possibly infinite, number of degrees of 
freedom. A classical field contains a set of degrees of freedom at each point of space; for instance, the classical 
electromagnetic field defines two vectors — the electric field and the magnetic field — that can in principle take on 
distinct values for each position r. When the field as a whole is considered as a quantum mechanical system, its 
observables form an infinite (in fact uncountable) set, because r is continuous. 

Furthermore, the degrees of freedom in a quantum field are arranged in "repeated" sets. For example, the degrees of 
freedom in an electromagnetic field can be grouped according to the position r, with exactly two vectors for each 
r. Note that r is an ordinary number that "indexes" the observables; it is not to be confused with the position 
operator x encountered in ordinary quantum mechanics, which is an observable. (Thus, ordinary quantum 
mechanics is sometimes referred to as "zero-dimensional quantum field theory", because it contains only a single set 
of observables.) 

It is also important to note that there is nothing special about r because, as it turns out, there is generally more than 
one way of indexing the degrees of freedom in the field. 

In the following sections, we will show how these ideas can be used to construct a quantum mechanical theory with 
the desired properties. We will begin by discussing single-particle quantum mechanics and the associated theory of 
many-particle quantum mechanics. Then, by finding a way to index the degrees of freedom in the many-particle 
problem, we will construct a quantum field and study its implications. 

Single-particle and many-particle quantum mechanics 

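In quantum mechanics, the time-dependent Schrödinger equation for a single particle is 

$$ i\hbar \frac{\partial}{\partial t} |\psi(t)\rangle = \left[ \frac{|\mathbf{p}|^2}{2m} + V(\mathbf{r}) \right] |\psi(t)\rangle , $$

where m is the particle's mass, V is the applied potential, and $|\psi\rangle$ denotes the quantum state. 

We wish to consider how this problem generalizes to N particles. There are two motivations for studying the 
many-particle problem. The first is a straightforward need in condensed matter physics, where typically the number 
of particles is on the order of Avogadro's number (6.0221415 × 10^23). The second motivation for the many-particle 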
problem arises from particle physics and the desire to incorporate the effects of special relativity. If one attempts to 
include the relativistic rest energy into the above equation (in quantum mechanics where position is an observable), 
the result is either the Klein-Gordon equation or the Dirac equation. However, these equations have many 
unsatisfactory qualities; for instance, they possess energy eigenvalues that extend to −∞, so that there seems to be no 
easy definition of a ground state. It turns out that such inconsistencies arise from relativistic wavefunctions having a 
probabilistic interpretation in position space, as probability conservation is not a relativistically covariant concept. In 
quantum field theory, unlike in quantum mechanics, position is not an observable, and thus, one does not need the 
concept of a position-space probability density. For quantum fields whose interaction can be treated perturbatively, 
this is equivalent to neglecting the possibility of dynamically creating or destroying particles, which is a crucial 
aspect of relativistic quantum theory. Einstein's famous mass-energy relation allows for the possibility that 
sufficiently massive particles can decay into several lighter particles, and sufficiently energetic particles can combine 
to form massive particles. For example, an electron and a positron can annihilate each other to create photons. This 
suggests that a consistent relativistic quantum theory should be able to describe many-particle dynamics. 

Furthermore, we will assume that the N particles are indistinguishable. As described in the article on identical 
particles, this implies that the state of the entire system must be either symmetric (bosons) or antisymmetric 
(fermions) when the coordinates of its constituent particles are exchanged. These multi-particle states are rather 
complicated to write. For example, the general quantum state of a system of N bosons is written as 

$$ |\phi_1 \cdots \phi_N\rangle = \frac{1}{\sqrt{N! \prod_j N_j!}} \sum_{p \in S_N} |\phi_{p(1)}\rangle \cdots |\phi_{p(N)}\rangle , $$

where $|\phi_i\rangle$ are the single-particle states, $N_j$ is the number of particles occupying state j, and the sum is taken 
over all possible permutations p acting on N elements. In general, this is a sum of N! (N factorial) 
terms, which quickly becomes unmanageable as N increases. The way to simplify this problem is to turn it into a 
quantum field theory. 
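
To get a feeling for how quickly the symmetrized sum grows, the construction can be sketched in a few lines of Python. This is only an illustration under stated assumptions: single-particle states are represented by integer labels, the sum runs over all N! orderings (repeats included), and the prefactor is taken to be 1/sqrt(N! ∏ N_j!), which normalizes such a sum. The function name `symmetrized_state` is hypothetical.

```python
from collections import Counter
from itertools import permutations
from math import factorial, isclose, prod, sqrt


def symmetrized_state(labels):
    """Return {ordered tuple of labels: amplitude} for the symmetric state.

    `labels` lists the single-particle state occupied by each particle,
    e.g. [1, 2, 2] for one particle in state 1 and two in state 2.
    """
    n = len(labels)
    occupations = Counter(labels)
    # Prefactor 1 / sqrt(N! * prod_j N_j!) for a sum over all N! orderings.
    norm = 1.0 / sqrt(factorial(n) * prod(factorial(c) for c in occupations.values()))
    state = Counter()
    for ordering in permutations(labels):  # N! orderings, repeats included
        state[ordering] += norm
    return dict(state)


state = symmetrized_state([1, 2, 2])
for ordering, amplitude in state.items():
    print(ordering, amplitude)
```

For three particles the loop already visits 3! = 6 orderings; for N particles it visits N!, which is why this representation becomes impractical so quickly.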

Second quantization 

In this section, we will describe a method for constructing a quantum field theory called second quantization. This 
basically involves choosing a way to index the quantum mechanical degrees of freedom in the space of multiple 
identical-particle states. It is based on the Hamiltonian formulation of quantum mechanics; several other approaches 
exist, such as the Feynman path integral, which uses a Lagrangian formulation. For an overview, see the article on 
quantization. 

Second quantization of bosons 

For simplicity, we will first discuss second quantization for bosons, which form perfectly symmetric quantum states. 
Let us denote the mutually orthogonal single-particle states by $|\phi_1\rangle$, $|\phi_2\rangle$, $|\phi_3\rangle$, and so on. For example, the 
3-particle state with one particle in state $|\phi_1\rangle$ and two in state $|\phi_2\rangle$ is 

$$ \frac{1}{\sqrt{3}} \left[ |\phi_1\rangle|\phi_2\rangle|\phi_2\rangle + |\phi_2\rangle|\phi_1\rangle|\phi_2\rangle + |\phi_2\rangle|\phi_2\rangle|\phi_1\rangle \right] . $$

The first step in second quantization is to express such quantum states in terms of occupation numbers, by listing 
the number of particles occupying each of the single-particle states $|\phi_1\rangle$, $|\phi_2\rangle$, etc. This is simply another way of 
labelling the states. For instance, the above 3-particle state is denoted as 

$$ |1, 2, 0, 0, 0, \ldots\rangle . $$

The next step is to expand the N-particle state space to include the state spaces for all possible values of N. This 
extended state space, known as a Fock space, is composed of the state space of a system with no particles (the 
so-called vacuum state), plus the state space of a 1-particle system, plus the state space of a 2-particle system, and so 
forth. It is easy to see that there is a one-to-one correspondence between the occupation number representation and 
valid boson states in the Fock space. 
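
The relabelling by occupation numbers can be sketched as follows. This is a minimal illustration, assuming single-particle states are numbered 1, 2, 3, ... and that only the first `n_states` of them are listed; the function name `occupation_numbers` is hypothetical.

```python
from collections import Counter


def occupation_numbers(labels, n_states):
    """Count how many particles occupy each of the states 1..n_states."""
    counts = Counter(labels)
    return tuple(counts.get(j, 0) for j in range(1, n_states + 1))


# One particle in state 1 and two in state 2, listed in two different orders.
occ = occupation_numbers([1, 2, 2], n_states=5)
occ_reordered = occupation_numbers([2, 1, 2], n_states=5)
print(occ)
```

Because the counts do not depend on the order in which the particles are listed, every ordering of the same labels maps to the same occupation-number tuple, which is the one-to-one correspondence with symmetric states mentioned above.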

At this point, the quantum mechanical system has become a quantum field in the sense we described above. The 
field's elementary degrees of freedom are the occupation numbers, and each occupation number is indexed by a 
number j, indicating which of the single-particle states $|\phi_1\rangle, |\phi_2\rangle, \ldots, |\phi_j\rangle, \ldots$ it refers to. 

The properties of this quantum field can be explored by defining creation and annihilation operators, which add and 
subtract particles. They are analogous to "ladder operators" in the quantum harmonic oscillator problem, which 
add and subtract energy quanta. However, these operators literally create and annihilate particles of a given 
quantum state. The bosonic annihilation operator $a_2$ and creation operator $a_2^\dagger$ have the following effects: 

$$ a_2 |N_1, N_2, N_3, \ldots\rangle = \sqrt{N_2} \, |N_1, (N_2 - 1), N_3, \ldots\rangle , $$
$$ a_2^\dagger |N_1, N_2, N_3, \ldots\rangle = \sqrt{N_2 + 1} \, |N_1, (N_2 + 1), N_3, \ldots\rangle . $$

It can be shown that these are operators in the usual quantum mechanical sense, i.e. linear operators acting on the 
Fock space. Furthermore, they are indeed Hermitian conjugates, which justifies the way we have written them. They 
can be shown to obey the commutation relation 

$$ [a_i, a_j] = 0 , \quad [a_i^\dagger, a_j^\dagger] = 0 , \quad [a_i, a_j^\dagger] = \delta_{ij} , $$

where $\delta_{ij}$ stands for the Kronecker delta. These are precisely the relations obeyed by the ladder operators for an 
infinite set of independent quantum harmonic oscillators, one for each single-particle state. Adding or removing 
bosons from each state is therefore analogous to exciting or de-exciting a quantum of energy in a harmonic 
oscillator. 
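
These relations can be checked numerically for a single mode. The sketch below assumes a truncated Fock space with at most `dim - 1` particles (an assumption made only so that the operators become finite matrices); the helper name `annihilation` is hypothetical. The truncation spoils the commutator in the highest state, so the check is restricted to the lower block.

```python
import numpy as np


def annihilation(dim):
    """Matrix of a in the basis |0>, |1>, ..., |dim-1>, with a|n> = sqrt(n)|n-1>."""
    return np.diag(np.sqrt(np.arange(1, dim)), k=1)


dim = 6
a = annihilation(dim)
adag = a.conj().T                      # creation operator as Hermitian conjugate
comm = a @ adag - adag @ a             # [a, a^dagger]

two_particles = np.zeros(dim)
two_particles[2] = 1.0                 # the state |2>
lowered = a @ two_particles            # equals sqrt(2) |1>

print(np.round(comm, 10))
```

The printed commutator is the identity except in the last diagonal entry, which is an artifact of cutting the space off at a finite particle number.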

The Hamiltonian of the quantum field (which, through the Schrödinger equation, determines its dynamics) can be 
written in terms of creation and annihilation operators. For instance, the Hamiltonian of a field of free 
(non-interacting) bosons is 

$$ H = \sum_k E_k \, a_k^\dagger a_k , $$

where $E_k$ is the energy of the k-th single-particle energy eigenstate. Note that 

$$ a_k^\dagger a_k |\ldots, N_k, \ldots\rangle = N_k |\ldots, N_k, \ldots\rangle . $$

Second quantization of fermions 

It turns out that a different definition of creation and annihilation must be used for describing fermions. According to 
the Pauli exclusion principle, fermions cannot share quantum states, so their occupation numbers $N_i$ can only take 
on the value 0 or 1. The fermionic annihilation operators $c$ and creation operators $c^\dagger$ are defined by their actions on 
a Fock state thus 

$$ c_j |N_1, N_2, \ldots, N_j = 0, \ldots\rangle = 0 , $$
$$ c_j |N_1, N_2, \ldots, N_j = 1, \ldots\rangle = (-1)^{(N_1 + \cdots + N_{j-1})} |N_1, N_2, \ldots, N_j = 0, \ldots\rangle , $$
$$ c_j^\dagger |N_1, N_2, \ldots, N_j = 0, \ldots\rangle = (-1)^{(N_1 + \cdots + N_{j-1})} |N_1, N_2, \ldots, N_j = 1, \ldots\rangle , $$
$$ c_j^\dagger |N_1, N_2, \ldots, N_j = 1, \ldots\rangle = 0 . $$

These obey an anticommutation relation: 

$$ \{c_i, c_j\} = 0 , \quad \{c_i^\dagger, c_j^\dagger\} = 0 , \quad \{c_i, c_j^\dagger\} = \delta_{ij} . $$

One may notice from this that applying a fermionic creation operator twice gives zero, so it is impossible for the 
particles to share single-particle states, in accordance with the exclusion principle. 
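
The sign rule in the definitions above is easy to get wrong, so a small sketch may help. It assumes a Fock state is stored as a tuple of occupation numbers and that an operator returns either `None` (the zero vector) or a pair `(sign, new_state)`; the function names `annihilate`, `create` and `apply` are hypothetical.

```python
def annihilate(j, occ):
    """Apply c_j (j counted from 1) to the Fock state `occ`."""
    if occ[j - 1] == 0:
        return None                          # c_j on an empty state gives zero
    sign = (-1) ** sum(occ[: j - 1])         # (-1)^(N_1 + ... + N_{j-1})
    return sign, occ[: j - 1] + (0,) + occ[j:]


def create(j, occ):
    """Apply c_j^dagger (j counted from 1) to the Fock state `occ`."""
    if occ[j - 1] == 1:
        return None                          # exclusion principle
    sign = (-1) ** sum(occ[: j - 1])
    return sign, occ[: j - 1] + (1,) + occ[j:]


def apply(ops, occ):
    """Apply a product of operators, written left to right, to `occ`."""
    sign = 1
    for op, j in reversed(ops):              # the rightmost operator acts first
        result = op(j, occ)
        if result is None:
            return None
        step_sign, occ = result
        sign *= step_sign
    return sign, occ


vacuum = (0, 0, 0)
twice = apply([(create, 2), (create, 2)], vacuum)      # c_2^dagger c_2^dagger |0>
order_12 = apply([(create, 1), (create, 2)], vacuum)   # c_1^dagger c_2^dagger |0>
order_21 = apply([(create, 2), (create, 1)], vacuum)   # c_2^dagger c_1^dagger |0>
print(twice, order_12, order_21)
```

Creating twice in the same state returns zero, and exchanging the order of two creation operators flips the sign, as the anticommutation relations require.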

Field operators 

We have previously mentioned that there can be more than one way of indexing the degrees of freedom in a quantum 
field. Second quantization indexes the field by enumerating the single-particle quantum states. However, as we have 
discussed, it is more natural to think about a "field", such as the electromagnetic field, as a set of degrees of freedom 
indexed by position. 

To this end, we can define field operators that create or destroy a particle at a particular point in space. In particle 
physics, these operators turn out to be more convenient to work with, because they make it easier to formulate 
theories that satisfy the demands of relativity. 

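Single-particle states are usually enumerated in terms of their momenta (as in the particle in a box problem). We can 
construct field operators by applying the Fourier transform to the creation and annihilation operators for these states. 
For example, the bosonic field annihilation operator $\phi(\mathbf{r})$ is 

$$ \phi(\mathbf{r}) = \frac{1}{\sqrt{\Omega}} \sum_j e^{i \mathbf{k}_j \cdot \mathbf{r}} \, a_j , $$

where $\mathbf{k}_j$ is the wave vector of the j-th single-particle state and $\Omega$ is the volume of the box. 

The bosonic field operators obey the commutation relation 

$$ [\phi(\mathbf{r}), \phi(\mathbf{r}')] = 0 , \quad [\phi^\dagger(\mathbf{r}), \phi^\dagger(\mathbf{r}')] = 0 , \quad [\phi(\mathbf{r}), \phi^\dagger(\mathbf{r}')] = \delta^3(\mathbf{r} - \mathbf{r}') , $$

where $\delta(x)$ stands for the Dirac delta function. As before, the fermionic relations are the same, with the 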
commutators replaced by anticommutators. 
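
A discrete analogue makes the origin of the delta function visible. The sketch below replaces continuous space by a ring of `L` sites (an assumption made only for illustration), so that the Fourier transform becomes a finite unitary matrix and the Dirac delta becomes a Kronecker delta. Since the momentum-space operators satisfy the commutation relation with a Kronecker delta, the commutator of two field operators reduces to a product of Fourier matrices.

```python
import numpy as np

L = 8                                    # number of lattice sites (hypothetical)
x = np.arange(L)                         # site positions
k = 2 * np.pi * np.arange(L) / L         # allowed wave numbers on the ring

# F[x, j] = exp(i k_j x) / sqrt(L), so that phi_x = sum_j F[x, j] a_j.
F = np.exp(1j * np.outer(x, k)) / np.sqrt(L)

# [phi_x, phi_y^dagger] = sum_{j,l} F[x, j] conj(F[y, l]) [a_j, a_l^dagger]
#                       = (F F^dagger)[x, y]   because [a_j, a_l^dagger] = delta_jl
commutator = F @ F.conj().T

print(np.round(commutator.real, 10))
```

The result is the identity matrix: field operators at different sites commute, and the commutator at the same site is one, the lattice counterpart of the delta function in the continuum relation.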

It should be emphasized that the field operator is not the same thing as a single-particle wavefunction. The former is 
an operator acting on the Fock space, and the latter is a quantum-mechanical amplitude for finding a particle in some 
position. However, they are closely related, and are indeed commonly denoted with the same symbol. If we have a 
Hamiltonian with a space representation, say 

$$ H = -\frac{\hbar^2}{2m} \sum_i \nabla_i^2 + \sum_{i<j} U(|\mathbf{r}_i - \mathbf{r}_j|) , $$

where the indices i and j run over all particles, then the field theory Hamiltonian (in the non-relativistic limit and 
for negligible self-interactions) is 

$$ H = -\frac{\hbar^2}{2m} \int d^3r \, \phi^\dagger(\mathbf{r}) \nabla^2 \phi(\mathbf{r}) + \frac{1}{2} \int d^3r \int d^3r' \, \phi^\dagger(\mathbf{r}) \phi^\dagger(\mathbf{r}') U(|\mathbf{r} - \mathbf{r}'|) \phi(\mathbf{r}') \phi(\mathbf{r}) . $$

This looks remarkably like an expression for the expectation value of the energy, with $\phi$ playing the role of the 
wavefunction. This relationship between the field operators and wavefunctions makes it very easy to formulate field 
theories starting from space-projected Hamiltonians. 

Implications of quantum field theory 
Unification of fields and particles 

The "second quantization" procedure that we have outlined in the previous section takes a set of single-particle 
quantum states as a starting point. Sometimes, it is impossible to define such single-particle states, and one must 
proceed directly to quantum field theory. For example, a quantum theory of the electromagnetic field must be a 
quantum field theory, because it is impossible (for various reasons) to define a wavefunction for a single photon. In 
such situations, the quantum field theory can be constructed by examining the mechanical properties of the classical 
field and guessing the corresponding quantum theory. For free (non-interacting) quantum fields, the quantum field 
theories obtained in this way have the same properties as those obtained using second quantization, such as 
well-defined creation and annihilation operators obeying commutation or anticommutation relations. 

Quantum field theory thus provides a unified framework for describing "field-like" objects (such as the 
electromagnetic field, whose excitations are photons) and "particle-like" objects (such as electrons, which are treated 
as excitations of an underlying electron field), so long as one can treat interactions as "perturbations" of free fields. 
There are still unsolved problems relating to the more general case of interacting fields that may or may not be 
adequately described by perturbation theory. For more on this topic, see Haag's theorem. 

Physical meaning of particle indistinguishability 

The second quantization procedure relies crucially on the particles being identical. We would not have been able to 
construct a quantum field theory from a distinguishable many-particle system, because there would have been no 
way of separating and indexing the degrees of freedom. 

Many physicists prefer to take the converse interpretation, which is that quantum field theory explains what identical 
particles are. In ordinary quantum mechanics, there is not much theoretical motivation for using symmetric 
(bosonic) or antisymmetric (fermionic) states, and the need for such states is simply regarded as an empirical fact. 
From the point of view of quantum field theory, particles are identical if and only if they are excitations of the same 
underlying quantum field. Thus, the question "why are all electrons identical?" arises from mistakenly regarding 
individual electrons as fundamental objects, when in fact it is only the electron field that is fundamental. 

Particle conservation and non-conservation 

During second quantization, we started with a Hamiltonian and state space describing a fixed number of particles 
(N), and ended with a Hamiltonian and state space for an arbitrary number of particles. Of course, in many common 
situations N is an important and perfectly well-defined quantity, e.g. if we are describing a gas of atoms sealed in a 
box. From the point of view of quantum field theory, such situations are described by quantum states that are 
eigenstates of the number operator $\hat{N}$, which measures the total number of particles present. As with any quantum 
mechanical observable, $\hat{N}$ is conserved if it commutes with the Hamiltonian. In that case, the quantum state is 
trapped in the N-particle subspace of the total Fock space, and the situation could equally well be described by 
ordinary N-particle quantum mechanics. (Strictly speaking, this is only true in the noninteracting case or in the low 
energy density limit of renormalized quantum field theories.) 

For example, we can see that the free-boson Hamiltonian described above conserves particle number. Whenever the 
Hamiltonian operates on a state, each particle destroyed by an annihilation operator $a_k$ is immediately put back by 
the creation operator $a_k^\dagger$. 
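
The conservation criterion can be tested directly on small matrices. The sketch below assumes two bosonic modes, each truncated to at most `dim - 1` particles, with illustrative energies `E1` and `E2`. The extra term in `H_driven`, which creates or destroys a single particle, is a hypothetical addition included only to show a Hamiltonian that does not commute with the number operator.

```python
import numpy as np


def annihilation(dim):
    """Truncated single-mode annihilation operator, a|n> = sqrt(n)|n-1>."""
    return np.diag(np.sqrt(np.arange(1, dim)), k=1)


def commutator(A, B):
    return A @ B - B @ A


dim = 5
a = annihilation(dim)
identity = np.eye(dim)
a1 = np.kron(a, identity)              # annihilation operator of mode 1
a2 = np.kron(identity, a)              # annihilation operator of mode 2

n1 = a1.T @ a1                         # matrices are real, so .T is the adjoint
n2 = a2.T @ a2
N = n1 + n2                            # total number operator

E1, E2 = 1.0, 2.5                      # illustrative single-particle energies
H_free = E1 * n1 + E2 * n2             # free-boson Hamiltonian
H_driven = H_free + 0.3 * (a1 + a1.T)  # hypothetical particle-changing term

print(np.abs(commutator(H_free, N)).max())
print(np.abs(commutator(H_driven, N)).max())
```

The first commutator vanishes, so the free Hamiltonian keeps a state inside its N-particle subspace; the second does not, so the modified Hamiltonian mixes subspaces with different particle numbers.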

On the other hand, it is possible, and indeed common, to encounter quantum states that are not eigenstates of $\hat{N}$, 
which do not have well-defined particle numbers. Such states are difficult or impossible to handle using ordinary 
quantum mechanics, but they can be easily described in quantum field theory as quantum superpositions of states 
having different values of N. For example, suppose we have a bosonic field whose particles can be created or 
destroyed by interactions with a fermionic field. The Hamiltonian of the combined system would be given by the 
Hamiltonians of the free boson and free fermion fields, plus a "potential energy" term such as 

$$ H_I = \sum_{k,q} V_q (a_q + a_{-q}^\dagger) c_{k+q}^\dagger c_k , $$

where $a_k^\dagger$ and $a_k$ denote the bosonic creation and annihilation operators, $c_k^\dagger$ and $c_k$ denote the fermionic 
creation and annihilation operators, and $V_q$ is a parameter that describes the strength of the interaction. This 
"interaction term" describes processes in which a fermion in state k either absorbs or emits a boson, thereby being 
kicked into a different eigenstate k + q. (In fact, this type of Hamiltonian is used to describe interaction between 
conduction electrons and phonons in metals. The interaction between electrons and photons is treated in a similar 
way, but is a little more complicated because the role of spin must be taken into account.) One thing to notice here is 
that even if we start out with a fixed number of bosons, we will typically end up with a superposition of states with 
different numbers of bosons at later times. The number of fermions, however, is conserved in this case. 
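
Both statements can be checked on a drastically reduced toy version of this interaction. The sketch assumes a single boson mode (truncated at `dim - 1` quanta) coupled to just two fermion states, with the fermion operators built by a Jordan-Wigner construction; the coupling `V` and the Hermitian form chosen for `H_I` are illustrative assumptions, not the full sum over k and q.

```python
import numpy as np

dim = 4                                           # boson cutoff (assumption)
b = np.diag(np.sqrt(np.arange(1, dim)), k=1)      # boson annihilation operator
lower = np.array([[0.0, 1.0], [0.0, 0.0]])        # fermion |1> -> |0>
Z = np.diag([1.0, -1.0])                          # sign factor (-1)^n
I2, Ib = np.eye(2), np.eye(dim)

# Operators on the combined space: boson (x) fermion state 1 (x) fermion state 2.
a = np.kron(b, np.kron(I2, I2))
c1 = np.kron(Ib, np.kron(lower, I2))
c2 = np.kron(Ib, np.kron(Z, lower))               # Z supplies the fermionic sign

V = 0.7                                           # illustrative coupling strength
hop = c2.T @ c1 + c1.T @ c2                       # fermion moves between states
H_I = V * (a + a.T) @ hop                         # boson emitted or absorbed

N_boson = a.T @ a
N_fermion = c1.T @ c1 + c2.T @ c2


def commutator(A, B):
    return A @ B - B @ A


# Start with no bosons and the fermion in state 1: basis index nb*4 + n1*2 + n2.
start = np.zeros(dim * 4)
start[0 * 4 + 1 * 2 + 0] = 1.0
after = H_I @ start                               # one boson, fermion in state 2

print(np.abs(commutator(H_I, N_fermion)).max())
print(np.abs(commutator(H_I, N_boson)).max())
```

The fermion number commutes with this interaction while the boson number does not, and a single application of `H_I` to a state without bosons produces a state with one boson.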

In condensed matter physics, states with ill-defined particle numbers are particularly important for describing the 
various superfluids. Many of the defining characteristics of a superfluid arise from the notion that its quantum state is 
a superposition of states with different particle numbers. In addition, the concept of a coherent state (used to model 
the laser and the BCS ground state) refers to a state with an ill-defined particle number but a well-defined phase. 

Axiomatic approaches 

The preceding description of quantum field theory follows the spirit in which most physicists approach the subject. 
However, it is not mathematically rigorous. Over the past several decades, there have been many attempts to put 
quantum field theory on a firm mathematical footing by formulating a set of axioms for it. These attempts fall into 
two broad classes. 

The first class of axioms, first proposed during the 1950s, includes the Wightman, Osterwalder-Schrader, and 
Haag-Kastler systems. They attempted to formalize the physicists' notion of an "operator-valued field" within the 
context of functional analysis, and enjoyed limited success. It was possible to prove that any quantum field theory 
satisfying these axioms satisfied certain general theorems, such as the spin-statistics theorem and the CPT theorem. 
Unfortunately, it proved extraordinarily difficult to show that any realistic field theory, including the Standard 
Model, satisfied these axioms. Most of the theories that could be treated with these analytic axioms were physically 
trivial, being restricted to low dimensions and lacking interesting dynamics. The construction of theories satisfying 
one of these sets of axioms falls in the field of constructive quantum field theory. Important work was done in this 
area in the 1970s by Segal, Glimm, Jaffe and others. 

During the 1980s, a second set of axioms based on geometric ideas was proposed. This line of investigation, which 
restricts its attention to a particular class of quantum field theories known as topological quantum field theories, is 
associated most closely with Michael Atiyah and Graeme Segal, and was notably expanded upon by Edward Witten, 
Richard Borcherds, and Maxim Kontsevich. However, most of the physically relevant quantum field theories, such 
as the Standard Model, are not topological quantum field theories; the quantum field theory of the fractional 
quantum Hall effect is a notable exception. The main impact of axiomatic topological quantum field theory has been 
on mathematics, with important applications in representation theory, algebraic topology, and differential geometry. 



Finding the proper axioms for quantum field theory is still an open and difficult problem in mathematics. One of the 
Millennium Prize Problems — proving the existence of a mass gap in Yang-Mills theory — is linked to this issue. 

Phenomena associated with quantum field theory 

In the previous part of the article, we described the most general properties of quantum field theories. Some of the 
quantum field theories studied in various fields of theoretical physics possess additional special properties, such as 
renormalizability, gauge symmetry, and supersymmetry. These are described in the following sections. 

Renormalization 

Early in the history of quantum field theory, it was found that many seemingly innocuous calculations, such as the 
perturbative shift in the energy of an electron due to the presence of the electromagnetic field, give infinite results. 
The reason is that the perturbation theory for the shift in an energy involves a sum over all other energy levels, and 
there are infinitely many levels at short distances that each give a finite contribution. 

Many of these problems are related to failures in classical electrodynamics that were identified but unsolved in the 
19th century, and they basically stem from the fact that many of the supposedly "intrinsic" properties of an electron 
are tied to the electromagnetic field that it carries around with it. The energy carried by a single electron — its self 
energy — is not simply the bare value, but also includes the energy contained in its electromagnetic field, its 
attendant cloud of photons. The energy in a field of a spherical source diverges in both classical and quantum 
mechanics, but as discovered by Weisskopf, in quantum mechanics the divergence is much milder, going only as the 
logarithm of the radius of the sphere. 

The solution to the problem, presciently suggested by Stueckelberg, independently by Bethe after the crucial 
experiment by Lamb, implemented at one loop by Schwinger, and systematically extended to all loops by Feynman 
and Dyson, with converging work by Tomonaga in isolated postwar Japan, is called renormalization. The technique 
of renormalization recognizes that the problem is essentially purely mathematical, that extremely short distances are 
at fault. In order to define a theory on a continuum, first place a cutoff on the fields, by postulating that quanta 
cannot have energies above some extremely high value. This has the effect of replacing continuous space by a 
structure where very short wavelengths do not exist, as on a lattice. Lattices break rotational symmetry, and one of 
the crucial contributions made by Feynman, Pauli and Villars, and modernized by 't Hooft and Veltman, is a 
symmetry preserving cutoff for perturbation theory. There is no known symmetrical cutoff outside of perturbation 
theory, so for rigorous or numerical work people often use an actual lattice. 

On a lattice, every quantity is finite but depends on the spacing. When taking the limit of zero spacing, we make sure 
that the physically observable quantities like the observed electron mass stay fixed, which means that the constants 
in the Lagrangian defining the theory depend on the spacing. Hopefully, by allowing the constants to vary with the 
lattice spacing, all the results at long distances become insensitive to the lattice, defining a continuum limit. 
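
The logic of holding observable quantities fixed while the cutoff is removed can be illustrated with a toy integral. This is emphatically not a QED calculation: the integrand k/(k² + m²) is chosen only because its integral up to a cutoff grows logarithmically, and the function name `loop_integral` and the masses `m1`, `m2` are hypothetical. The point is that each value diverges with the cutoff, while the difference between two such values approaches a finite limit.

```python
import math


def loop_integral(cutoff, m):
    """Closed form of the integral of k / (k^2 + m^2) for k from 0 to cutoff."""
    return 0.5 * math.log(1.0 + (cutoff / m) ** 2)


m1, m2 = 1.0, 3.0
results = []
for cutoff in (1e2, 1e4, 1e6):
    single = loop_integral(cutoff, m1)                 # grows like log(cutoff)
    difference = single - loop_integral(cutoff, m2)    # tends to log(m2 / m1)
    results.append((cutoff, single, difference))
    print(cutoff, single, difference)
```

In the same spirit, renormalization absorbs the cutoff-dependent part into the constants of the Lagrangian, so that relations between observable quantities no longer depend on the cutoff.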

The renormalization procedure only works for a certain class of quantum field theories, called renormalizable 
quantum field theories. A theory is perturbatively renormalizable when the constants in the Lagrangian only 
diverge at worst as logarithms of the lattice spacing for very short spacings. The continuum limit is then well defined 
in perturbation theory, and even if it is not fully well defined non-perturbatively, the problems only show up at 
distance scales that are exponentially small in the inverse coupling for weak couplings. The Standard Model of 
particle physics is perturbatively renormalizable, and so are its component theories (quantum 
electrodynamics/electroweak theory and quantum chromodynamics). Of the three components, quantum 
electrodynamics is believed to not have a continuum limit, while the asymptotically free SU(2) and SU(3) weak 
isospin and strong color interactions are nonperturbatively well defined. 

The renormalization group describes how renormalizable theories emerge as the long distance low-energy effective 
field theory for any given high-energy theory. Because of this, renormalizable theories are insensitive to the precise 
nature of the underlying high-energy short-distance phenomena. This is a blessing because it allows physicists to 
formulate low energy theories without knowing the details of high energy phenomena. It is also a curse, because 
once a renormalizable theory like the standard model is found to work, it gives very few clues to higher energy 
processes. The only way high energy processes can be seen in the standard model is when they allow otherwise 
forbidden events, or if they predict quantitative relations between the coupling constants. 

Gauge freedom 

A gauge theory is a theory that admits a symmetry with a local parameter. For example, in every quantum theory the 
global phase of the wave function is arbitrary and does not represent something physical. Consequently, the theory is 
invariant under a global change of phases (adding a constant to the phase of all wave functions, everywhere); this is a 
global symmetry. In quantum electrodynamics, the theory is also invariant under a local change of phase: one may 
shift the phase of all wave functions by an amount that differs at every point in space-time. This is a 
local symmetry. However, in order for a well-defined derivative operator to exist, one must introduce a new field, 
the gauge field, which also transforms in order for the local change of variables (the phase in our example) not to 
affect the derivative. In quantum electrodynamics this gauge field is the electromagnetic field. Such a local change 
of variables is termed a gauge transformation. 

In quantum field theory the excitations of fields represent particles. The particle associated with excitations of the 
gauge field is the gauge boson, which is the photon in the case of quantum electrodynamics. 

The degrees of freedom in quantum field theory are local fluctuations of the fields. The existence of a gauge 
symmetry reduces the number of degrees of freedom, simply because some fluctuations of the fields can be 
transformed to zero by gauge transformations, so they are equivalent to having no fluctuations at all, and they 
therefore have no physical meaning. Such fluctuations are usually called "non-physical degrees of freedom" or gauge 
artifacts; usually some of them have a negative norm, making them inadequate for a consistent theory. Therefore, if 
a classical field theory has a gauge symmetry, then its quantized version (i.e. the corresponding quantum field 
theory) must have this symmetry as well. In other words, a gauge symmetry cannot have a quantum anomaly. If a 
gauge symmetry is anomalous (i.e. not kept in the quantum theory) then the theory is inconsistent: for example, in 
quantum electrodynamics, had there been a gauge anomaly, this would require the appearance of photons with 
longitudinal polarization and polarization in the time direction, the latter having a negative norm, rendering the 
theory inconsistent; another possibility would be for these photons to appear only in intermediate processes but not 
in the final products of any interaction, making the theory non-unitary and again inconsistent (see optical theorem). 

In general, the gauge transformations of a theory consist of several different transformations, which may not be 
commutative. These transformations are together described by a mathematical object known as a gauge group. 
Infinitesimal gauge transformations are the gauge group generators. Therefore the number of gauge bosons is the 
group dimension (i.e. number of generators forming a basis). 

All the fundamental interactions in nature are described by gauge theories. These are: 

• Quantum electrodynamics, whose gauge transformation is a local change of phase, so that the gauge group is 
U(1). The gauge boson is the photon. 

• Quantum chromodynamics, whose gauge group is SU(3). The gauge bosons are eight gluons. 

• The electroweak theory, whose gauge group is U(1) × SU(2) (a direct product of U(1) and SU(2)). 

• Gravity, whose classical theory is general relativity, admits the equivalence principle, which is a form of gauge 
symmetry. However, it is explicitly non-renormalizable. 
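The counting rule stated above (one gauge boson per generator of the gauge group) can be checked with a few lines of code. The following is a minimal sketch, not part of the original article: the function and dictionary names are illustrative, and it assumes only the standard dimensions dim U(n) = n^2 and dim SU(n) = n^2 - 1, with the dimension of a direct product being the sum of the dimensions of its factors.

```python
# Count gauge bosons as the dimension of the gauge group.
# Names below (dim_u, dim_su, GAUGE_GROUPS, ...) are illustrative assumptions.

def dim_u(n: int) -> int:
    """Dimension of the unitary group U(n)."""
    return n * n


def dim_su(n: int) -> int:
    """Dimension of the special unitary group SU(n)."""
    return n * n - 1


# Each theory is a direct product of simple factors, written as (family, n).
GAUGE_GROUPS = {
    "QED": [("U", 1)],
    "QCD": [("SU", 3)],
    "electroweak": [("U", 1), ("SU", 2)],
}


def count_gauge_bosons(factors) -> int:
    """Sum the dimensions of the factors of a direct-product gauge group."""
    total = 0
    for family, n in factors:
        total += dim_u(n) if family == "U" else dim_su(n)
    return total


boson_counts = {name: count_gauge_bosons(f) for name, f in GAUGE_GROUPS.items()}
```

With these inputs the sketch gives one boson for quantum electrodynamics (the photon), eight for quantum chromodynamics (the gluons), and four for the electroweak theory (the fields that mix into the three weak bosons and the photon).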






Multivalued Gauge Transformations 

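The gauge transformations which leave the theory invariant involve, by definition, only single-valued gauge functions 
$\Lambda(x)$, which satisfy the Schwarz integrability criterion 

$$(\partial_\mu \partial_\nu - \partial_\nu \partial_\mu)\,\Lambda(x) = 0.$$

An interesting extension of gauge transformations arises if the gauge functions are allowed to be multivalued 
functions which violate the integrability criterion. These are capable of changing the physical field strengths and are 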
therefore no proper symmetry transformations. Nevertheless, the transformed field equations describe correctly the 
physical laws in the presence of the newly generated field strengths. See the textbook by H. Kleinert cited below for 
the applications to phenomena in physics. 

Supersymmetry 

Supersymmetry assumes that every fundamental fermion has a superpartner that is a boson and vice versa. It was 
introduced in order to solve the so-called Hierarchy Problem, that is, to explain why particles not protected by any 
symmetry (like the Higgs boson) do not receive radiative corrections to their masses that drive them to the larger 
scales (GUT, Planck...). It was soon realized that supersymmetry has other interesting properties: its gauged version is an 
extension of general relativity (Supergravity), and it is a key ingredient for the consistency of string theory. 

The way supersymmetry protects the hierarchies is the following: since for every particle there is a superpartner with 
the same mass, any loop in a radiative correction is cancelled by the loop corresponding to its superpartner, 
rendering the theory UV finite. 

Since no superpartners have yet been observed, if supersymmetry exists it must be broken (through a so-called soft 
term, which breaks supersymmetry without ruining its helpful features). The simplest models of this breaking require 
that the energy of the superpartners not be too high; in these cases, supersymmetry is expected to be observed by 
experiments at the Large Hadron Collider. 




See also 



• List of quantum field theories 

• Constructive quantum field theory 

• Feynman path integral 

• Quantum chromodynamics 

• Quantum electrodynamics 

• Quantum flavordynamics 

• Quantum geometrodynamics 

• Quantum hydrodynamics 

• Quantum magnetodynamics 

• Quantum triviality 

• Schwinger-Dyson equation 

• Einstein-Maxwell-Dirac equations 

• Relation between Schrödinger's equation and the path integral formulation of quantum mechanics 

• Abraham-Lorentz force 

• Green's function (many-body theory) 

• Common integrals in quantum field theory 

• Wheeler-Feynman absorber theory 

• Wigner's theorem 

• Wigner's classification 

• Static forces and virtual-particle exchange 

• Photon polarization 

• Theoretical and experimental justification for the Schrödinger equation 

• Invariance mechanics 

• Form factor 

• Green-Kubo relations 

• Basic concepts of quantum mechanics 

• Relationship between string theory and quantum field theory 






Notes 

[1] Weinberg, S. Quantum Field Theory, Vols. I to III, 2000, Cambridge University Press: Cambridge, UK. 
[2] (http://physics.princeton.edu/~mcdonald/examples/QM/thorn_ajp_72_1210_04.pdf) 

[3] Dirac, P.A.M. (1927). The Quantum Theory of the Emission and Absorption of Radiation, Proceedings of the Royal Society of London, Series 
A, Vol. 114, p. 243. 

[4] Abraham Pais, Inward Bound: Of Matter and Forces in the Physical World ISBN 0-19-851997-4. Pais recounts his astonishment at the 
rapidity with which Feynman could calculate using his method. Feynman's method is now part of the standard methods for physicists. 

Further reading 

General readers: 

• Feynman, R.P. (2001) [1964]. The Character of Physical Law. MIT Press. ISBN 0262560038. 

• Feynman, R.P. (2006) [1985]. QED: The Strange Theory of Light and Matter. Princeton University Press. 
ISBN 0691125759. 

• Gribbin, J. (1998). Q is for Quantum: Particle Physics from A to Z. Weidenfeld & Nicolson. ISBN 0297817523. 

• Schumm, Bruce A. (2004) Deep Down Things. Johns Hopkins Univ. Press. Chpt. 4. 

Introductory texts: 

• Bogoliubov, N.; Shirkov, D. (1982). Quantum Fields. Benjamin-Cummings. ISBN 0805309837. 

• Frampton, P.H. (2000). Gauge Field Theories. Frontiers in Physics (2nd ed.). Wiley. 

• Greiner, W; Muller, B. (2000). Gauge Theory of Weak Interactions. Springer. ISBN 3-540-67672-4. 

• Itzykson, C; Zuber, J.-B. (1980). Quantum Field Theory. McGraw-Hill. ISBN 0-07-032071-3. 

• Kane, G.L. (1987). Modern Elementary Particle Physics. Perseus Books. ISBN 0-201-11749-5. 

• Kleinert, H.; Schulte-Frohlinde, Verena (2001). Critical Properties of φ⁴-Theories (http://users.physik.fu-berlin.de/~kleinert/re.html#B6). 
World Scientific. ISBN 981-02-4658-7. 

• Kleinert, H. (2008). Multivalued Fields in Condensed Matter, Electrodynamics, and Gravitation 
(http://users.physik.fu-berlin.de/~kleinert/public_html/kleiner_reb11/psfiles/mvf.pdf). World Scientific. 
ISBN 978-981-279-170-2. 

• Loudon, R. (1983). The Quantum Theory of Light. Oxford University Press. ISBN 0-19-851155-8. 

• Mandl, F.; Shaw, G. (1993). Quantum Field Theory. John Wiley & Sons. ISBN 0-471-94186-7. 

• Peskin, M.; Schroeder, D. (1995). An Introduction to Quantum Field Theory. Westview Press. 
ISBN 0-201-50397-2. 

• Ryder, L.H. (1985). Quantum Field Theory. Cambridge University Press. ISBN 0-521-33859-X. 

• Srednicki, Mark (2007) Quantum Field Theory (http://www.cambridge.org/us/catalogue/catalogue.asp?isbn=0521864496). 
Cambridge Univ. Press. 

• Yndurain, F.J. (1996). Relativistic Quantum Mechanics and Introduction to Field Theory (1st ed.). Springer. 
ISBN 978-3540604532. 

• Zee, A. (2003). Quantum Field Theory in a Nutshell. Princeton University Press. ISBN 0-691-01019-6. 

Advanced texts: 

• Bogoliubov, N.; Logunov, A. A.; Oksak, A.I.; Todorov, I.T. (1990). General Principles of Quantum Field Theory. 
Kluwer Academic Publishers. ISBN 978-0792305408. 

• Weinberg, S. (1995). The Quantum Theory of Fields. 1-3. Cambridge University Press. 

Articles: 

• Gerard 't Hooft (2007) " The Conceptual Basis of Quantum Field Theory (http://www.phys.uu.nl/~thooft/ 
lectures/basisqft.pdf)" in Butterfield, J., and John Earman, eds., Philosophy of Physics, Part A. Elsevier: 
661-730. 

• Frank Wilczek (1999) "Quantum field theory (http://arxiv.org/abs/hep-th/9803075)", Reviews of Modern 
Physics 71: S83-S95. doi:10.1103/RevModPhys.71.S83. 






External links 

• Stanford Encyclopedia of Philosophy: " Quantum Field Theory, (http://plato.stanford.edu/entries/ 
quantum-field-theory/)" by Meinard Kuhlmann. 

• Siegel, Warren, 2005. Fields, (http://insti.physics.sunysb.edu/~siegel/errata.html) A free text, also available 
from arXiv:hep-th/9912205. 

• Pedagogic Aids to Quantum Field Theory (http://quantumfieldtheory.info). Click on "Introduction" for a 
simplified introduction suitable for someone familiar with quantum mechanics. 

• Free condensed matter books and notes (http://www.freebookcentre.net/Physics/Condensed-Matter-Books. 
html). 

• Quantum field theory texts (http://motls.blogspot.com/2006/01/qft-didactics.html), a list with links to 
amazon.com. 

• Quantum Field Theory (http://www.nat.vu.nl/~mulders/QFT-0.pdf) by P. J. Mulders 

• Quantum Field Theory (http://damtp.cam.ac.uk/user/tong/qft/qft.pdf) by David Tong 

• Quantum Field Theory Video Lectures (http://pirsa.org/index.php?p=speaker&name=David_Tong) by David 
Tong 

• Quantum Field Theory Lecture Notes (http://www2.physics.utoronto.ca/~luke/PHY2403/References_files/ 
lecturenotes.pdf) by Michael Luke 

• Quantum Field Theory Video Lectures (http://www.physics.harvard.edu/about/Phys253.html) by Sidney R. 
Coleman 

• Quantum Field Theory Lecture Notes (http://www.physics.gla.ac.uk/~dmiller/lectures/RQF_1-6_2010.pdf) 
by D.J. Miller 

• Quantum Field Theory Lecture Notes II (http://www.physics.gla.ac.uk/~dmiller/lectures/RQF_7-9_2010. 
pdf) by D.J. Miller 

Scalar field theory 

In theoretical physics, scalar field theory can refer to a classical or quantum theory of scalar fields. A field which is 
invariant under any Lorentz transformation is called a "scalar", in contrast to a vector or tensor field. The quanta of 
the quantized scalar field are spin-zero particles, and as such are bosons. 

No fundamental scalar fields have been observed in nature, though the Higgs boson may yet prove the first example. 
However, scalar fields appear in the effective field theory descriptions of many physical phenomena. An example is 
the pion, which is actually a "pseudoscalar", which means it is not invariant under parity transformations which 
invert the spatial directions, distinguishing it from a true scalar, which is parity-invariant. Because of the relative 
simplicity of the mathematics involved, scalar fields are often the first field introduced to a student of classical or 
quantum field theory. 

In this article, the repeated index notation indicates the Einstein summation convention for summation over repeated 
indices. The theories described are defined in flat, D-dimensional Minkowski space, with (D-1) spatial dimensions 
and one time dimension and are, by construction, relativistically covariant. The Minkowski space metric, $\eta_{\mu\nu}$, has a 
particularly simple form: it is diagonal, and here we use the + - - - sign convention. 






Classical scalar field theory 
Linear (free) theory 

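The most basic scalar field theory is the linear theory. The action for the free relativistic scalar field theory is 

$$S = \int d^{D-1}x\,dt\,\mathcal{L} = \int d^{D-1}x\,dt\left[\frac{1}{2}\eta^{\mu\nu}\partial_\mu\phi\,\partial_\nu\phi - \frac{1}{2}m^2\phi^2\right] = \int d^{D-1}x\,dt\left[\frac{1}{2}(\partial_t\phi)^2 - \frac{1}{2}\delta^{ij}\partial_i\phi\,\partial_j\phi - \frac{1}{2}m^2\phi^2\right],$$

where $\mathcal{L}$ is known as a Lagrangian density. This is an example of a quadratic action, since each of the terms is 
quadratic in the field, $\phi$. The term proportional to $m^2$ is sometimes known as a mass term, due to its interpretation 
in the quantized version of this theory in terms of particle mass. 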

The equation of motion for this theory is obtained by extremizing the action above. It takes the following form, 
linear in $\phi$: 

$$\eta^{\mu\nu}\partial_\mu\partial_\nu\phi + m^2\phi = \partial_t^2\phi - \nabla^2\phi + m^2\phi = 0.$$

Note that this is the same as the Klein-Gordon equation, but that here the interpretation is as a classical field 
equation, rather than as a quantum mechanical wave equation. 
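As an illustration of the linear field equation, one can check numerically that a plane wave solves it only when frequency and wavenumber obey the relativistic dispersion relation. The sketch below is not from the original article: it assumes one spatial dimension, units with c = 1, the field equation in the form (second time derivative) minus (second space derivative) plus m^2 times the field equal to zero, and a simple central finite-difference approximation. All function names are illustrative.

```python
import math


def on_shell_wave(x: float, t: float, k: float, m: float) -> float:
    """Plane wave cos(k x - omega t) with omega^2 = k^2 + m^2."""
    omega = math.sqrt(k * k + m * m)
    return math.cos(k * x - omega * t)


def off_shell_wave(x: float, t: float, k: float, m: float) -> float:
    """Plane wave with the massless dispersion omega = k (wrong for m != 0)."""
    return math.cos(k * (x - t))


def klein_gordon_residual(phi, x, t, k, m, h=1e-3):
    """Finite-difference estimate of d2(phi)/dt2 - d2(phi)/dx2 + m^2 phi."""
    centre = phi(x, t, k, m)
    d2t = (phi(x, t + h, k, m) - 2.0 * centre + phi(x, t - h, k, m)) / h**2
    d2x = (phi(x + h, t, k, m) - 2.0 * centre + phi(x - h, t, k, m)) / h**2
    return d2t - d2x + m * m * centre


residual_on_shell = klein_gordon_residual(on_shell_wave, 0.4, 0.9, k=1.3, m=0.7)
residual_off_shell = klein_gordon_residual(off_shell_wave, 0.4, 0.9, k=1.3, m=0.7)
```

The on-shell residual vanishes up to discretization error, while the off-shell wave leaves a residual of order m^2 times the field value.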



Nonlinear (interacting) theory 

The most common generalization of the linear theory above is to add a scalar potential $V(\phi)$ to the Lagrangian, 
where typically $V$ is a polynomial in $\phi$ of order 3 or more (often a monomial). Such a theory is sometimes 
said to be interacting, because the Euler-Lagrange equation is now nonlinear, implying a self-interaction. The 
action for the most general such theory is 

$$S = \int d^{D-1}x\,dt\,\mathcal{L} = \int d^{D-1}x\,dt\left[\frac{1}{2}\eta^{\mu\nu}\partial_\mu\phi\,\partial_\nu\phi - V(\phi)\right] = \int d^{D-1}x\,dt\left[\frac{1}{2}(\partial_t\phi)^2 - \frac{1}{2}\delta^{ij}\partial_i\phi\,\partial_j\phi - \frac{1}{2}m^2\phi^2 - \sum_{n=3}^{\infty}\frac{1}{n!}g_n\phi^n\right].$$

The $n!$ factors in the expansion are introduced because they are useful in the Feynman diagram expansion of the 
quantum theory, as described below. The corresponding Euler-Lagrange equation of motion is 

$$\eta^{\mu\nu}\partial_\mu\partial_\nu\phi + V'(\phi) = \partial_t^2\phi - \nabla^2\phi + V'(\phi) = 0.$$

Dimensional analysis and scaling 

Physical quantities in these scalar field theories may have dimensions of length, time or mass, or some combination 
of the three. However, in a relativistic theory, any quantity $t$, with dimensions of time, can be 'converted' into a 
length, $l = ct$, by using the velocity of light, $c$. 

Similarly, any length $l$ is equivalent to an inverse mass, $l = \hbar/(mc)$, using Planck's constant, $\hbar$. Heuristically, one 
can think of a time as a length, or either time or length as an inverse mass. In short, one can think of the dimensions 
of any physical quantity as defined in terms of just one independent dimension, rather than in terms of all three. This 
is most often termed the mass dimension of the quantity. 
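The conversions described above can be made concrete with SI values. The following is a minimal sketch, not part of the original article: the constant values are the standard SI figures for the reduced Planck constant, the speed of light and the electron mass, and the function names are illustrative. The length associated with a mass in this way is the reduced Compton wavelength.

```python
HBAR = 1.054571817e-34        # reduced Planck constant, J s
C = 299792458.0               # speed of light, m/s
ELECTRON_MASS = 9.1093837015e-31  # kg


def time_to_length(t_seconds: float) -> float:
    """Convert a time to a length using l = c t."""
    return C * t_seconds


def mass_to_length(m_kg: float) -> float:
    """Convert a mass to a length using l = hbar / (m c)."""
    return HBAR / (m_kg * C)


def length_to_mass(l_metres: float) -> float:
    """Inverse conversion, m = hbar / (l c)."""
    return HBAR / (l_metres * C)


electron_length = mass_to_length(ELECTRON_MASS)  # about 3.86e-13 m
```

For the electron this gives a length of roughly 3.86e-13 m, which illustrates why a larger mass corresponds to a shorter length scale.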

One objection is that this theory is classical, and therefore it is not obvious that Planck's constant should be a part of 
the theory at all. In a sense this is a valid objection, and if desired one can indeed recast the theory without mass 
dimensions at all. However, this would be at the expense of making the connection with the quantum scalar field 
slightly more obscure. Given that one has dimensions of mass, Planck's constant is thought of here as an essentially 
arbitrary fixed quantity with dimensions appropriate to convert between mass and inverse length. This is consistent 
with the Feynman path integral approach to quantization, where the only reason for Planck's constant to appear stems 
from the same type of dimensional argument, since the action must be divided by some parameter with these 
dimensions to render the phase dimensionless. 

Scaling Dimension 

The classical scaling dimension, or mass dimension, $\Delta$, of $\phi$ describes the transformation of the field under a 
rescaling of coordinates: 

$$x \to \lambda x, \qquad \phi \to \lambda^{-\Delta}\phi.$$

The units of action are the same as the units of $\hbar$, and so the action itself has zero mass dimension. This fixes the 
scaling dimension of $\phi$ to be 

$$\Delta = \frac{D-2}{2}.$$

Scale invariance 

There is a specific sense in which some scalar field theories are scale-invariant. While the actions above are all 
constructed to have zero mass dimension, not all actions are invariant under the scaling transformation 

$$x \to \lambda x, \qquad \phi \to \lambda^{-\Delta}\phi.$$

The reason that not all actions are invariant is that one usually thinks of the parameters $m$ and $g_n$ as fixed quantities, 
which are not rescaled under the transformation above. The condition for a scalar field theory to be scale invariant is 
then quite obvious: all of the parameters appearing in the action should be dimensionless quantities. In other words, a 
scale invariant theory is one without any fixed length scale (or equivalently, mass scale) in the theory. 

For a scalar field theory with $D$ spacetime dimensions, the only dimensionless parameter $g_n$ satisfies $n = \frac{2D}{D-2}$. 
For example, in $D = 4$ only $g_4$ is classically dimensionless, and so the only classically scale-invariant scalar field 
theory in $D = 4$ is the massless $\phi^4$ theory. Classical scale invariance normally does not imply quantum scale 
invariance. See the discussion of the beta function below. 
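The dimension counting behind this statement can be sketched in code. The example below is not from the original article: it assumes that the action is dimensionless, that the kinetic term fixes the mass dimension of the field at (D - 2)/2, and hence that a coupling $g_n$ multiplying the n-th power of the field has mass dimension D - n(D - 2)/2. Function names are illustrative, and exact rational arithmetic is used to avoid rounding.

```python
from fractions import Fraction


def field_dimension(D: int) -> Fraction:
    """Classical mass dimension of a scalar field in D spacetime dimensions."""
    return Fraction(D - 2, 2)


def coupling_dimension(D: int, n: int) -> Fraction:
    """Mass dimension of the coupling multiplying phi**n in the Lagrangian."""
    return D - n * field_dimension(D)


def marginal_power(D: int):
    """Power n for which g_n is dimensionless; None when D = 2 (no solution)."""
    if D == 2:
        return None
    return Fraction(2 * D, D - 2)


# The quartic coupling is dimensionless in four dimensions.
quartic_in_4d = coupling_dimension(4, 4)
```

Under these assumptions the quartic coupling is dimensionless in D = 4, the sextic coupling in D = 3, and the cubic coupling in D = 6, while the coefficient of the quadratic term always carries mass dimension 2, consistent with its reading as a squared mass.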

Conformal invariance 

A transformation 

$$x \to \tilde{x}(x)$$

is said to be conformal if the transformation satisfies 

$$\frac{\partial \tilde{x}^\mu}{\partial x^\rho}\frac{\partial \tilde{x}^\nu}{\partial x^\sigma}\,\eta_{\mu\nu} = \lambda^2(x)\,\eta_{\rho\sigma}$$

for some function $\lambda^2(x)$. The conformal group contains as subgroups the isometries of the metric (the 
Poincaré group) and also the scaling transformations (or dilatations) considered above. In fact, the scale-invariant 
theories in the previous section are also conformally invariant. 






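φ⁴ theory 

Massive $\phi^4$ theory illustrates a number of interesting phenomena in scalar field theory. 
The Lagrangian density is 

$$\mathcal{L} = \frac{1}{2}(\partial_t\phi)^2 - \frac{1}{2}\delta^{ij}\partial_i\phi\,\partial_j\phi - \frac{1}{2}m^2\phi^2 - \frac{g}{4!}\phi^4.$$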
Spontaneous symmetry breaking 

This Lagrangian has a $\mathbb{Z}_2$ symmetry under the transformation $\phi \to -\phi$. 

This is an example of an internal symmetry, in contrast to a space-time symmetry. 

If $m^2$ is positive, the potential $V(\phi) = \frac{1}{2}m^2\phi^2 + \frac{g}{4!}\phi^4$ has a single minimum, at the origin. The solution 
$\phi = 0$ is clearly invariant under the $\mathbb{Z}_2$ symmetry. Conversely, if $m^2$ is negative, then one can readily see that the 
potential $V(\phi) = \frac{1}{2}m^2\phi^2 + \frac{g}{4!}\phi^4$ has two minima. This is known as a double well potential, and the lowest 
energy states (known as the vacua, in quantum field theoretical language) in such a theory are not invariant under the 
$\mathbb{Z}_2$ symmetry of the action (in fact it maps each of the two vacua into the other). In this case, the $\mathbb{Z}_2$ symmetry is 
said to be spontaneously broken. 
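The location of the minima can be worked out explicitly. The sketch below is not part of the original article: it assumes the quartic potential in the form V = m^2 phi^2 / 2 + g phi^4 / 4! with g > 0, for which setting the derivative to zero gives phi = 0 or phi^2 = -6 m^2 / g. The function names are illustrative.

```python
import math


def potential(phi: float, m2: float, g: float) -> float:
    """Quartic potential V = m2 * phi^2 / 2 + g * phi^4 / 4!."""
    return 0.5 * m2 * phi**2 + g * phi**4 / 24.0


def vacua(m2: float, g: float):
    """Field values at the minima of the potential (requires g > 0)."""
    if g <= 0:
        raise ValueError("the potential is bounded below only for g > 0")
    if m2 >= 0:
        return [0.0]                      # single symmetric minimum
    v = math.sqrt(-6.0 * m2 / g)          # from V'(phi) = 0 with phi != 0
    return [-v, v]                        # two minima exchanged by phi -> -phi


symmetric_phase = vacua(1.0, 0.5)
broken_phase = vacua(-1.0, 6.0)
```

In the broken phase the two minima have equal energy, lower than the energy at the origin, and the reflection that is a symmetry of the potential maps one minimum to the other.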

Kink solutions 

The $\phi^4$ theory with a negative $m^2$ also has a kink solution, which is a canonical example of a soliton. Such a 
solution is of the form 

$$\phi(\vec{x}, t) = \pm\frac{m}{2\sqrt{g/4!}}\tanh\left[\frac{m(x - x_0)}{\sqrt{2}}\right],$$

where $m$ is understood as $\sqrt{|m^2|}$ and $x$ is one of the spatial variables ($\phi$ is taken to be independent of $t$ and of the remaining spatial variables). The 
solution interpolates between the two different vacua of the double well potential. It is not possible to deform the 
kink into a constant solution without passing through a solution of infinite energy, and for this reason the kink is said 
to be stable. For $D > 2$, i.e. theories with more than one spatial dimension, this solution is called a domain wall. 
Another well-known example of a scalar field theory with kink solutions is the sine-Gordon theory. 
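A kink profile of this type can be checked numerically against the static field equation. The sketch below is not from the original article: it assumes the potential V = m^2 phi^2 / 2 + g phi^4 / 4! with negative m^2, the static equation phi'' = V'(phi), and a hyperbolic tangent profile with amplitude sqrt(-6 m^2 / g) and inverse width sqrt(-m^2 / 2). The names and the finite-difference step are illustrative choices.

```python
import math


def kink(x: float, m2: float, g: float, x0: float = 0.0) -> float:
    """Static kink profile for the double-well potential (m2 < 0, g > 0)."""
    v = math.sqrt(-6.0 * m2 / g)      # vacuum value, minimum of the potential
    a = math.sqrt(-m2 / 2.0)          # inverse width of the kink
    return v * math.tanh(a * (x - x0))


def static_residual(x: float, m2: float, g: float, h: float = 1e-3) -> float:
    """Finite-difference estimate of phi'' - V'(phi) on the kink profile."""
    phi = kink(x, m2, g)
    second = (kink(x + h, m2, g) - 2.0 * phi + kink(x - h, m2, g)) / h**2
    return second - (m2 * phi + g * phi**3 / 6.0)


residuals = [static_residual(x, -1.0, 6.0) for x in (-2.0, -0.5, 0.0, 0.7, 3.0)]
left_limit = kink(-20.0, -1.0, 6.0)
right_limit = kink(20.0, -1.0, 6.0)
```

The residuals vanish up to discretization error, and the profile tends to the two different vacuum values far to the left and far to the right, which is the interpolation property described above.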

Complex scalar field theory 

In a complex scalar field theory, the scalar field takes values in the complex numbers, rather than the real numbers. 
The action considered normally takes the form 



$$S = \int d^{D-1}x\,dt\,\mathcal{L} = \int d^{D-1}x\,dt\left[\eta^{\mu\nu}\partial_\mu\phi^*\,\partial_\nu\phi - V(|\phi|^2)\right].$$

This has a U(1) symmetry, whose action on the space of fields rotates $\phi \to e^{i\alpha}\phi$, for some real phase angle $\alpha$. 

As for the real scalar field, spontaneous symmetry breaking is found if $m^2$ is negative. This gives rise to a Mexican 
hat potential which is analogous to the double-well potential in real scalar field theory, but now the choice of 
vacuum breaks a continuous U(1) symmetry instead of a discrete one. This leads to a Goldstone boson. 

O(N) theory 

One can express the complex scalar field theory in terms of two real fields, $\phi^1 = \mathrm{Re}\,\phi$ and $\phi^2 = \mathrm{Im}\,\phi$, which 
transform in the vector representation of the U(1) = O(2) internal symmetry. Although such fields transform as a 
vector under the internal symmetry, they are still Lorentz scalars. This can be generalised to a theory of N scalar 
fields transforming in the vector representation of the O(N) symmetry. The Lagrangian for an O(N)-invariant scalar 
field theory is typically of the form 






$$\mathcal{L} = \frac{1}{2}\eta^{\mu\nu}\partial_\mu\phi\cdot\partial_\nu\phi - V(\phi\cdot\phi)$$

using an appropriate O(N)-invariant inner product. 

Quantum scalar field theory 

In quantum field theory, the fields, and all observables constructed from them, are replaced by quantum operators on 
a Hilbert space. This Hilbert space is built on a vacuum state, and dynamics are governed by a Hamiltonian, a 
positive operator which annihilates the vacuum. A construction of a quantum scalar field theory may be found in the 
canonical quantization article, which uses canonical commutation relations among the fields as a basis for the 
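construction. In brief, the basic variables are the field $\phi$ and its canonical momentum $\pi$. Both fields are Hermitian. 
At spatial points $\vec{x}$, $\vec{y}$ at equal times, the canonical commutation relations are given by 

$$[\phi(\vec{x}),\phi(\vec{y})] = [\pi(\vec{x}),\pi(\vec{y})] = 0, \qquad [\phi(\vec{x}),\pi(\vec{y})] = i\delta(\vec{x}-\vec{y}),$$

and the free Hamiltonian is 

$$H = \int d^3x\left[\frac{1}{2}\pi^2 + \frac{1}{2}(\nabla\phi)^2 + \frac{m^2}{2}\phi^2\right].$$

A spatial Fourier transform leads to momentum space fields 

$$\tilde\phi(\vec{k}) = \int d^3x\,e^{-i\vec{k}\cdot\vec{x}}\phi(\vec{x}), \qquad \tilde\pi(\vec{k}) = \int d^3x\,e^{-i\vec{k}\cdot\vec{x}}\pi(\vec{x}),$$

which are used to define annihilation and creation operators 

$$a(\vec{k}) = E\tilde\phi(\vec{k}) + i\tilde\pi(\vec{k}), \qquad a^\dagger(\vec{k}) = E\tilde\phi(-\vec{k}) - i\tilde\pi(-\vec{k}),$$

where $E = \sqrt{k^2 + m^2}$ and the Hermiticity of the fields has been used in the form $\tilde\phi(\vec{k})^\dagger = \tilde\phi(-\vec{k})$. These operators satisfy the commutation relations 

$$[a(\vec{k}_1), a(\vec{k}_2)] = [a^\dagger(\vec{k}_1), a^\dagger(\vec{k}_2)] = 0, \qquad [a(\vec{k}_1), a^\dagger(\vec{k}_2)] = (2\pi)^3\,2E\,\delta(\vec{k}_1 - \vec{k}_2).$$

The state $|0\rangle$ annihilated by all of the operators $a$ is identified as the bare vacuum, and a particle with momentum $\vec{k}$ 
is created by applying $a^\dagger(\vec{k})$ to the vacuum. Applying all possible combinations of creation operators to the vacuum 
constructs the Hilbert space. This construction is called Fock space. The vacuum is annihilated by the Hamiltonian 

$$H = \int \frac{d^3k}{(2\pi)^3}\,\frac{1}{2}\,a^\dagger(\vec{k})\,a(\vec{k}),$$

where the zero-point energy has been removed by Wick-ordering. (See canonical quantization.) 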

Interactions can be included by adding an interaction Hamiltonian. For a $\phi^4$ theory, this corresponds to adding a 
Wick-ordered term $g\,{:}\phi^4{:}/4!$ to the Hamiltonian, and integrating over $x$. Scattering amplitudes may be calculated from 
this Hamiltonian in the interaction picture. These are constructed in perturbation theory by means of the Dyson 
series, which gives the time-ordered products, or n-particle Green's functions $\langle 0|\mathcal{T}\{\phi(x_1)\cdots\phi(x_n)\}|0\rangle$, as 
described in the Dyson series article. The Green's functions may also be obtained from a generating function that is 
constructed as a solution to the Schwinger-Dyson equation. 






Feynman Path Integral 

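The Feynman diagram expansion may be obtained also from the Feynman path integral formulation.[1] The time 
ordered vacuum expectation values of polynomials in $\phi$, known as the n-particle Green's functions, are constructed 
by integrating over all possible fields, normalized by the vacuum expectation value with no external fields, 

$$\langle 0|\mathcal{T}\{\phi(x_1)\cdots\phi(x_n)\}|0\rangle = \frac{\int\mathcal{D}\phi\,\phi(x_1)\cdots\phi(x_n)\,e^{i\int d^4x\left(\frac{1}{2}\partial^\mu\phi\,\partial_\mu\phi - \frac{m^2}{2}\phi^2 - \frac{g}{4!}\phi^4\right)}}{\int\mathcal{D}\phi\,e^{i\int d^4x\left(\frac{1}{2}\partial^\mu\phi\,\partial_\mu\phi - \frac{m^2}{2}\phi^2 - \frac{g}{4!}\phi^4\right)}}.$$

All of these Green's functions may be obtained by expanding the exponential in $J(x)\phi(x)$ in the generating function 

$$Z[J] = \int\mathcal{D}\phi\,e^{i\int d^4x\left(\frac{1}{2}\partial^\mu\phi\,\partial_\mu\phi - \frac{m^2}{2}\phi^2 - \frac{g}{4!}\phi^4 + J\phi\right)} = Z[0]\sum_{n=0}^{\infty}\frac{i^n}{n!}\,J(x_1)\cdots J(x_n)\,\langle 0|\mathcal{T}\{\phi(x_1)\cdots\phi(x_n)\}|0\rangle,$$

with an integration over each $x_i$ understood. 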

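A Wick rotation may be applied to make time imaginary. Changing the signature to (++++) then turns the Feynman 
integral into a statistical mechanics partition function in Euclidean space, 

$$Z[J] = \int\mathcal{D}\phi\,e^{-\int d^4x\left(\frac{1}{2}(\nabla\phi)^2 + \frac{m^2}{2}\phi^2 + \frac{g}{4!}\phi^4 - J\phi\right)}.$$

Normally, this is applied to the scattering of particles with fixed momenta, in which case, a Fourier transform is 
useful, giving instead 

$$\tilde{Z}[\tilde{J}] = \int\mathcal{D}\tilde\phi\,e^{-\int d^4p\left(\frac{1}{2}(p^2+m^2)\tilde\phi^2 + \frac{g}{4!}\tilde\phi^4 - \tilde{J}\tilde\phi\right)},$$

where the quartic term is written schematically (in momentum space it is a convolution over the momenta of the four fields). 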



The standard trick to evaluate this functional integral is to write it as a product of exponential factors, schematically, 

$$\tilde{Z}[\tilde{J}] \sim \int\mathcal{D}\tilde\phi\,\prod_p\left[e^{-(p^2+m^2)\tilde\phi^2/2}\,e^{-g\tilde\phi^4/4!}\,e^{\tilde{J}\tilde\phi}\right].$$

The second two exponential factors can be expanded as power series, and the combinatorics of this expansion can be 
represented graphically. The integral with $g = 0$ can be treated as a product of infinitely many elementary Gaussian 
integrals, and the result may be expressed as a sum of Feynman diagrams, calculated using the following Feynman 
rules: 

• Each field $\tilde\phi(p)$ in the n-point Euclidean Green's function is represented by an external line (half-edge) in the 
graph, and associated with momentum p. 

• Each vertex is represented by a factor -g. 

• At a given order $g^k$, all diagrams with n external lines and k vertices are constructed such that the momenta 
flowing into each vertex sum to zero. Each internal line is represented by a propagator $1/(q^2 + m^2)$, where q is the 
momentum flowing through that line. 

• Any unconstrained momenta are integrated over all values. 

• The result is divided by a symmetry factor, which is the number of ways the lines and vertices of the graph can be 
rearranged without changing its connectivity. 

• Do not include graphs containing "vacuum bubbles", connected subgraphs with no external lines. 

The last rule takes into account the effect of dividing by $\tilde{Z}[0]$. The Minkowski-space Feynman rules are similar, 
except that each vertex is represented by $-ig$, while each internal line is represented by a propagator $i/(q^2 - m^2 + i\epsilon)$, 
where the $\epsilon$ term represents the small Wick rotation needed to make the Minkowski-space Gaussian integral 
converge. 
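The combinatorics behind the symmetry factors can be illustrated in a toy model with a single integration variable in place of a field. The sketch below is not from the original article: it assumes a "zero-dimensional" partition function Z(g), the ordinary integral of exp(-phi^2/2 - g phi^4/4!) over phi divided by sqrt(2 pi), and evaluates it with a plain trapezoid rule. Expanding in g gives Z = 1 - g/8 + 35 g^2/384 - ..., where the 1/8 is the three Wick pairings of four fields divided by 4!, i.e. one over the symmetry factor of the figure-eight vacuum bubble. The function name and grid parameters are illustrative.

```python
import math


def z_toy(g: float, half_width: float = 12.0, steps: int = 24000) -> float:
    """Trapezoid-rule value of the toy partition function Z(g)."""
    h = 2.0 * half_width / steps
    total = 0.0
    for i in range(steps + 1):
        phi = -half_width + i * h
        weight = 0.5 if i in (0, steps) else 1.0   # trapezoid end-point weights
        total += weight * math.exp(-0.5 * phi * phi - g * phi**4 / 24.0)
    return total * h / math.sqrt(2.0 * math.pi)


g_small = 0.01
z_numeric = z_toy(g_small)
z_first_order = 1.0 - g_small / 8.0
z_second_order = z_first_order + 35.0 * g_small**2 / 384.0
```

In the field theory these bubble contributions are exactly what cancels when one divides by the partition function at zero source, which is why the last Feynman rule discards them; the toy model only shows where the numerical factors come from.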






Renormalization 

The integrals over unconstrained momenta, called "loop integrals", in the Feynman graphs typically diverge. This is 
normally handled by renormalization, which is a procedure of adding divergent counter-terms to the Lagrangian in 
such a way that the diagrams constructed from the original Lagrangian and counter-terms are finite.[2] A 
renormalization scale must be introduced in the process, and the coupling constant and mass become dependent upon 
it. 

The dependence of a coupling constant $g$ on the scale $\lambda$ is encoded by a beta function, $\beta(g)$, defined by the relation 

$$\beta(g) = \lambda\,\frac{\partial g}{\partial\lambda}.$$

This dependence on the energy scale is known as the running of the coupling parameter, and the theory of this kind of 
scale-dependence in quantum field theory is described by the renormalization group. 

Beta-functions are usually computed in an approximation scheme, most commonly perturbation theory, where one 
assumes that the coupling constant is small. One can then make an expansion in powers of the coupling parameters 
and truncate the higher-order terms (also known as higher loop contributions, due to the number of loops in the 
corresponding Feynman graphs). 

The beta function at one loop (the first perturbative contribution) for the $\phi^4$ theory is 

$$\beta(g) = \frac{3}{16\pi^2}\,g^2 + O(g^3).$$

The fact that the sign in front of the lowest-order term is positive suggests that the coupling constant increases with 
energy. If this behavior persists at large couplings, this would indicate the presence of a Landau pole at finite energy, 
or quantum triviality. The question can only be answered non-perturbatively, since it involves strong coupling. 
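The one-loop running and the resulting Landau pole can be made explicit. The sketch below is not from the original article: it assumes the one-loop beta function 3 g^2 / (16 pi^2) for a quartic coupling normalized with a 1/4!, truncates all higher orders, and integrates the flow equation in closed form. The variable log_ratio stands for the logarithm of the ratio of the scale to a reference scale; all names are illustrative.

```python
import math

B = 3.0 / (16.0 * math.pi**2)   # assumed one-loop coefficient of the beta function


def running_coupling(g0: float, log_ratio: float) -> float:
    """One-loop solution g = g0 / (1 - B g0 log_ratio) of dg/d(log scale) = B g^2."""
    denominator = 1.0 - B * g0 * log_ratio
    if denominator <= 0.0:
        raise ValueError("scale is at or beyond the one-loop Landau pole")
    return g0 / denominator


def landau_pole_log(g0: float) -> float:
    """Value of log(scale / reference scale) at which the one-loop coupling diverges."""
    return 1.0 / (B * g0)


g_reference = running_coupling(0.5, 0.0)
g_higher = running_coupling(0.5, 10.0)
pole = landau_pole_log(0.5)
```

For a reference coupling of 0.5 the pole sits about 105 e-folds above the reference scale, far beyond the range where a one-loop truncation can be trusted, which is why the text notes that the question must be settled non-perturbatively.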

A quantum field theory is trivial when the running coupling, computed through its beta function, goes to zero when 
the cutoff is removed. Consequently, the propagator becomes that of a free particle and the field is no longer 
interacting. Alternatively, the field theory may be interpreted as an effective theory, in which the cutoff is not 
removed, giving finite interactions but leading to a Landau pole at some energy scale. For a $\phi^4$ interaction, Michael 
Aizenman proved that the theory is indeed trivial for space-time dimension $D \geq 5$.[3] For $D = 4$ the triviality 
has yet to be proven rigorously, but lattice computations have provided strong evidence for it. (See Landau pole for details and 
references.) This fact is relevant to the Higgs field, for which triviality bounds are used to set limits on the Higgs 
mass, based on the requirement that new physics must enter at a higher scale (perhaps the Planck scale) to prevent the Landau pole 
from being reached. 

References 

[1] A general reference for this section is Ramond, Pierre (2001-12-21). Field Theory: A Modern Primer (Second Edition). USA: Westview 
Press. ISBN 0201304503. 

[2] See the previous reference, or for more detail, Itzykson, Claude; Zuber, Jean-Bernard (2006-02-24). Quantum Field Theory. Dover. 
ISBN 0070320713. 

[3] Aizenman, M. (1981). "Proof of the Triviality of $\phi^4_d$ Field Theory and Some Mean-Field Features of Ising Models for d>4". Physical 
Review Letters 47: 1-4. doi:10.1103/PhysRevLett.47.1. 






Further reading 

• Peskin, M. and Schroeder, D.; An Introduction to Quantum Field Theory, Westview Press (1995) 

• Weinberg, Steven; The Quantum Theory of Fields, (3 volumes) Cambridge University Press (1995) 

• Srednicki, Mark; Quantum Field Theory, Cambridge University Press (2007) 

• Zinn-Justin, Jean; Quantum Field Theory and Critical Phenomena, Oxford University Press (2002) 

External links 

• Pedagogic Aides to Quantum Field Theory (http://www.quantumfieldtheory.info) Click on the link for Chap. 3 
to find an extensive, simplified introduction to scalars in relativistic quantum mechanics and quantum field 
theory. 

• 't Hooft, G., "The Conceptual Basis of Quantum Field Theory" (online version (http://www.phys.uu.nl/~thooft/lectures/basisqft.pdf)). 

Yang-Mills theory 

Yang-Mills theory is a gauge theory based on the SU(N) group. Wolfgang Pauli formulated in 1953 the first 
consistent generalization of the five-dimensional theory of Kaluza, Klein, Fock and others to a higher dimensional 
internal space.[1] Because Pauli saw no way to give masses to the gauge bosons, he refrained from publishing his 
results formally.[1] 

Although Pauli did not publish this theory, he gave talks widely attended by physicists of the time. In early 1954, 
Yang and Mills developed a modern formulation in an effort to extend the original concept of gauge theory for 
abelian groups, e.g. quantum electrodynamics, to nonabelian groups to provide an explanation for strong 
interactions. This initial idea was not a success, since the quanta of the Yang-Mills field must be massless in order to 
maintain gauge invariance. The massless particles should have long range effects, but these effects are not seen in 
experiments. The idea was set aside until 1960, when the concept of particles acquiring mass through symmetry 
breaking in massless theories was put forward, initially by Jeffrey Goldstone, Yoichiro Nambu, and Giovanni 
Jona-Lasinio. 

This prompted a significant restart of Yang-Mills theory studies that proved successful in the formulation of both 
electroweak unification and quantum chromodynamics (QCD). The electroweak interaction is described by the 
SU(2)×U(1) group while QCD is an SU(3) gauge theory. The electroweak theory is obtained by combining SU(2) 
with U(1), where quantum electrodynamics (QED) is described by a U(1) group, and is replaced in the unified 
electroweak theory by a U(1) group representing a weak hypercharge rather than electric charge. The massless 
bosons from the SU(2)×U(1) theory mix after spontaneous symmetry breaking to produce the 3 massive weak 
bosons, and the photon field. The Standard Model combines the strong interaction with the unified electroweak 
interaction (unifying the weak and electromagnetic interaction) through the symmetry group SU(2)×U(1)×SU(3). In 
the current epoch the strong interaction is not unified with the electroweak interaction, but from the observed 
running of the coupling constants it is believed they all converge to a single value at very high energies. 

Phenomenology at lower energies in quantum chromodynamics is not completely understood due to the difficulties 
of managing such a theory with a strong coupling. This is the reason confinement has not been theoretically proven, 
though it is a consistent experimental observation. Proof that QCD confines at low energy is a mathematical problem 
of great relevance, and an award has been proposed by the Clay Mathematics Institute for whoever is able to show 
that the Yang-Mills theory has a mass gap. 






Mathematical overview 

Yang-Mills theories are a special example of gauge theory with a non-abelian symmetry group, given by the
Lagrangian

$$\mathcal{L}_{gf} = -\frac{1}{2}\operatorname{Tr}(F^2) = -\frac{1}{4}F^{\mu\nu a}F^a_{\mu\nu}$$

with the generators of the Lie algebra corresponding to the F-quantities (the curvature or field-strength form)
satisfying

$$[T^a, T^b] = i f^{abc} T^c$$

and the covariant derivative defined as

$$D_\mu = I\partial_\mu - i g T^a A^a_\mu$$

where $I$ is the identity for the group generators, $A^a_\mu$ is the vector potential, and $g$ is the coupling
constant. In four dimensions, the coupling constant $g$ is a pure number and for an SU(N) group one has
$a, b, c = 1 \ldots N^2 - 1$.

The relation

$$F^a_{\mu\nu} = \partial_\mu A^a_\nu - \partial_\nu A^a_\mu + g f^{abc} A^b_\mu A^c_\nu$$

can be derived from the commutator

$$[D_\mu, D_\nu] = -i g T^a F^a_{\mu\nu}.$$
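
The commutator computation can be spelled out as a short worked derivation. The sketch below assumes the covariant derivative $D_\mu = I\partial_\mu - i g T^a A^a_\mu$ and the algebra $[T^a, T^b] = i f^{abc} T^c$ with totally antisymmetric structure constants; the commutator is understood as acting on a field in the representation carried by the generators.

```latex
% Assumptions: D_\mu = I\partial_\mu - i g T^a A^a_\mu, [T^a, T^b] = i f^{abc} T^c,
% and f^{abc} totally antisymmetric (so f^{bca} = f^{abc}).
\begin{aligned}
[D_\mu, D_\nu]
  &= [\,\partial_\mu - i g T^a A^a_\mu,\; \partial_\nu - i g T^b A^b_\nu\,] \\
  &= -i g\, T^a \left(\partial_\mu A^a_\nu - \partial_\nu A^a_\mu\right)
     + (-i g)^2\, [T^b, T^c]\, A^b_\mu A^c_\nu \\
  &= -i g\, T^a \left(\partial_\mu A^a_\nu - \partial_\nu A^a_\mu\right)
     - i g^2 f^{bca}\, T^a A^b_\mu A^c_\nu \\
  &= -i g\, T^a \left(\partial_\mu A^a_\nu - \partial_\nu A^a_\mu
     + g f^{abc} A^b_\mu A^c_\nu\right)
   \;=\; -i g\, T^a F^a_{\mu\nu}.
\end{aligned}
```

The term quadratic in the potential comes entirely from the non-vanishing commutator of the generators, which is why it is absent in the abelian case.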

The field is self-interacting, and the equations of motion that one obtains are said to be semilinear,
as the nonlinearities appear both with and without derivatives. This means that one can manage this theory only by
perturbation theory, with small nonlinearities.

Note that the transition between "upper" ("contravariant") and "lower" ("covariant") vector or tensor components is
trivial for $a$ indices (e.g. $f^{abc} = f_{abc}$), whereas for $\mu$ and $\nu$ it is nontrivial, corresponding e.g.
to the usual Lorentz signature, $\eta_{\mu\nu} = \mathrm{diag}(+---)$.

From the given Lagrangian one can derive the equations of motion given by 

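$$\partial^\mu F^a_{\mu\nu} + g f^{abc} A^{\mu b} F^c_{\mu\nu} = 0.$$

Putting $F_{\mu\nu} = T^a F^a_{\mu\nu}$, these can be rewritten as

$$(D^\mu F_{\mu\nu})^a = 0.$$

A Bianchi identity holds

$$(D_\mu F_{\nu\kappa})^a + (D_\kappa F_{\mu\nu})^a + (D_\nu F_{\kappa\mu})^a = 0.$$

A source enters into the equations of motion as

$$\partial^\mu F^a_{\mu\nu} + g f^{abc} A^{\mu b} F^c_{\mu\nu} = -J^a_\nu.$$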

Note that the currents must properly change under gauge group transformations. 



Quantization of Yang-Mills theory 

The most appropriate method to quantize the Yang-Mills theory is by functional methods, i.e. path integrals. One 
introduces a generating functional for n-point functions as 

$$Z[j] = \int [dA]\, \exp\left[-\frac{i}{2}\int d^4x\, \operatorname{Tr}(F^{\mu\nu}F_{\mu\nu}) + i\int d^4x\, j^a_\mu(x) A^{a\mu}(x)\right]$$

but this integral has no meaning as it stands, because the vector potential can be arbitrarily chosen due to the
gauge freedom. This problem was already known for quantum electrodynamics but becomes more severe here due to the
non-abelian properties of the gauge group. A way out was given by Ludvig Faddeev and Victor Popov with the
introduction of a ghost field (see Faddeev-Popov ghost) that has the property of being unphysical since, although it
obeys Fermi-Dirac statistics, it is a complex scalar field, violating in this way the spin-statistics theorem. So,
we can write the generating functional as

$$Z[j, \bar\varepsilon, \varepsilon] = \int [dA][d\bar c][dc]\; e^{iS_F[\partial A, A] + iS_{gf}[\partial A] + iS_g[\partial c, \partial\bar c, c, \bar c, A]}\; e^{i\int d^4x\, j^a_\mu(x) A^{a\mu}(x) + i\int d^4x\, [\bar c^a(x)\varepsilon^a(x) + \bar\varepsilon^a(x) c^a(x)]}$$

where $S_F$, $S_{gf}$ and $S_g$ denote the gauge-field, gauge-fixing and ghost actions respectively. This is the
expression commonly used to derive Feynman's rules (see Feynman diagram). Here $c^a$ is the ghost field, while the
gauge parameter $\alpha$ in the gauge-fixing term fixes the choice of gauge for the quantization. Feynman's rules
obtained from this functional are the following:



[Figure: Feynman rules for Yang-Mills theory: gluon propagator, 3-gluon vertex, 4-gluon vertex, ghost propagator
and ghost-gluon vertex.]

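These rules for Feynman diagrams are easily obtained when we realize that the generating functional given above
can be rewritten as

$$Z[j, \bar\varepsilon, \varepsilon] = \exp\left(-ig\int d^4x\, \frac{\delta}{i\delta\bar\varepsilon^a(x)} f^{abc}\partial_\mu \frac{i\delta}{\delta j^b_\mu(x)} \frac{i\delta}{\delta\varepsilon^c(x)}\right)$$
$$\times \exp\left(-ig\int d^4x\, f^{abc}\partial_\mu \frac{i\delta}{\delta j^a_\nu(x)} \frac{i\delta}{\delta j^b_\mu(x)} \frac{i\delta}{\delta j^{c\nu}(x)}\right)$$
$$\times \exp\left(-i\frac{g^2}{4}\int d^4x\, f^{abc}f^{ars} \frac{i\delta}{\delta j^b_\mu(x)} \frac{i\delta}{\delta j^c_\nu(x)} \frac{i\delta}{\delta j^{r\mu}(x)} \frac{i\delta}{\delta j^{s\nu}(x)}\right) \times Z_0[j, \bar\varepsilon, \varepsilon]$$

with

$$Z_0[j, \bar\varepsilon, \varepsilon] = \exp\left(-\int d^4x\, d^4y\, \bar\varepsilon^a(x) C^{ab}(x-y) \varepsilon^b(y)\right) \exp\left(\frac{1}{2}\int d^4x\, d^4y\, j^a_\mu(x) D^{ab\mu\nu}(x-y) j^b_\nu(y)\right)$$

the generating functional of the free theory. Expanding in $g$ and computing the functional derivatives, we are
able to obtain all the n-point functions with perturbation theory. Using the LSZ reduction formula, we get from the
n-point functions the corresponding amplitudes for the given processes, from which cross sections and decay rates
are readily obtained. The theory is renormalizable and corrections are finite at any order of perturbation theory.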

For quantum electrodynamics, where the gauge group is abelian (see abelian group), the ghost field decouples. This
can be seen from the coupling between the gauge field and the ghost field, which is
$\bar c^a f^{abc} \partial_\mu A^{b\mu} c^c$. For the abelian case all the structure constants $f^{abc}$ are zero
and so there is no coupling. In the non-abelian case, the ghost field appears as a useful way to rewrite the quantum
field theory without physical consequences on the observables of the theory, such as cross sections or decay rates.






One of the most important results obtained for Yang-Mills theory is asymptotic freedom. This result can be obtained
by assuming that the coupling constant $g$ is small (so that the nonlinearities are small), as indeed happens at
high energies, and applying perturbation theory. The relevance of this result is due to the fact that a Yang-Mills
theory describes the strong interactions, and asymptotic freedom permits a proper treatment of experimental results
coming from deep inelastic scattering.

In order to obtain the behavior of the Yang-Mills theory at high energies, and so to prove asymptotic freedom, one
does perturbation theory assuming a small coupling. This is verified a posteriori in the ultraviolet limit. In the
opposite, infrared limit the situation is reversed, the coupling being too large for perturbation theory to be
reliable. Indeed, most of the difficulties that current research meets lie in managing the theory at low energies,
which is the interesting regime, being inherent to the description of hadronic matter and, more generally, to all
the observed bound states of gluons and quarks and their confinement (see hadrons). The most used method to study
the theory in this limit is to try to solve it on computers (see lattice gauge theory). In this case, large
computational resources are needed to be sure that the correct limit of infinite volume (smaller lattice spacing)
is reached. This is the limit the results have to be compared with. Smaller spacing and larger coupling are not
independent of each other, and achieving both demands larger computational resources. As of today, the situation
appears somewhat satisfactory for the hadronic spectrum and the computation of the gluon and ghost propagators, but
the glueball and hybrid spectra are still a debated matter, also in view of the experimental observation of such
exotic states. Indeed, the σ resonance[4][5] is not seen in any such lattice computations, and contrasting
interpretations have been put forward. This is currently a hotly debated issue.



Beta function 

One of the key properties of a quantum field theory is the behavior of the running coupling over the whole energy
range. This behavior can be obtained from a theory once its beta function is known. Our ability to extract results
from a quantum field theory relies on perturbation theory. Once the beta function is known, the behavior of the
running coupling at all energy scales is obtained through the equation

$$\mu^2 \frac{d\alpha_s(\mu^2)}{d\mu^2} = \beta(\alpha_s(\mu^2))$$

with $\alpha_s = g^2/4\pi$. Yang-Mills theory has the property of being asymptotically free in the large energy
limit (ultraviolet limit). This means that, in this limit, the beta function has a minus sign, driving the running
coupling toward ever smaller values as the energy increases. Perturbation theory permits the evaluation of the beta
function in this limit, producing the following result for SU(N):

$$\beta(\alpha_s) = -\frac{11N}{12\pi}\alpha_s^2 - \frac{17N^2}{24\pi^2}\alpha_s^3 + O(\alpha_s^4).$$
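
To make the statement about asymptotic freedom concrete, the following minimal Python sketch integrates the renormalization group equation $\mu^2\, d\alpha_s/d\mu^2 = \beta(\alpha_s)$ numerically for pure SU(N) Yang-Mills theory, using the perturbative coefficients $11N/12\pi$ and $17N^2/24\pi^2$. The function names, the starting value 0.3, the choice N = 3 and the Runge-Kutta integrator are illustrative assumptions, and the perturbative formula is only trusted where the coupling is small.

```python
import math


def beta(alpha, n_colors=3, loops=2):
    """Perturbative SU(N) beta function, convention mu^2 d(alpha)/d(mu^2) = beta(alpha)."""
    b0 = 11.0 * n_colors / (12.0 * math.pi)
    b1 = 17.0 * n_colors ** 2 / (24.0 * math.pi ** 2)
    value = -b0 * alpha ** 2
    if loops >= 2:
        value -= b1 * alpha ** 3
    return value


def run_coupling(alpha0, log_mu2_ratio, n_colors=3, loops=2, steps=2000):
    """Evolve alpha from mu0 to mu, where log_mu2_ratio = ln(mu^2 / mu0^2).

    Uses the classical fourth-order Runge-Kutta scheme (an illustrative choice).
    """
    h = log_mu2_ratio / steps
    alpha = alpha0
    for _ in range(steps):
        k1 = beta(alpha, n_colors, loops)
        k2 = beta(alpha + 0.5 * h * k1, n_colors, loops)
        k3 = beta(alpha + 0.5 * h * k2, n_colors, loops)
        k4 = beta(alpha + h * k3, n_colors, loops)
        alpha += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return alpha


def one_loop_closed_form(alpha0, log_mu2_ratio, n_colors=3):
    """Exact solution of the one-loop equation: 1/alpha = 1/alpha0 + b0 * ln(mu^2/mu0^2)."""
    b0 = 11.0 * n_colors / (12.0 * math.pi)
    return alpha0 / (1.0 + b0 * alpha0 * log_mu2_ratio)


if __name__ == "__main__":
    # The coupling decreases monotonically as the energy scale grows.
    for t in (0.0, 2.0, 4.0, 8.0):
        print(t, run_coupling(0.3, t))
```

Setting `loops=1` reproduces the closed-form one-loop solution, which gives a simple consistency check on the integrator; the two-loop term only makes the coupling fall faster.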

In the opposite limit of low energies (infrared limit), the beta function is not known. It is known exactly,
however, for a supersymmetric Yang-Mills theory. This result was obtained by Novikov, Shifman, Vainshtein and
Zakharov[6] and can be written as

$$\beta(\alpha_s) = -\frac{\alpha_s^2}{4\pi}\, \frac{3N}{1 - \frac{\alpha_s N}{2\pi}}.$$

With this starting point, Thomas Ryttov and Francesco Sannino were able to postulate a non-supersymmetric version
of it,[7] writing down

$$\beta(\alpha_s) = -\frac{11N}{12\pi}\, \frac{\alpha_s^2}{1 - \frac{\alpha_s}{2\pi}\frac{17N}{11}}.$$

As can be seen from the beta function of the supersymmetric theory, the limit of a large coupling (infrared limit)
implies

$$\beta(\alpha_s) \approx \frac{3}{2}\alpha_s > 0$$

and so the running coupling in the deep infrared limit goes to zero, making this theory trivial. This implies that
the coupling reaches a maximum at some value of the energy, turning again to zero as the energy is lowered. Then, if
the Ryttov and Sannino hypothesis is correct, the same should be true for ordinary Yang-Mills theory. This would be
in agreement with recent lattice computations.[8]
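
The large-coupling behavior can be checked by a short worked computation. The sketch below assumes the supersymmetric beta function in the form $\beta(\alpha_s) = -\frac{\alpha_s^2}{4\pi}\frac{3N}{1 - \alpha_s N/2\pi}$ together with the convention $\mu^2\, d\alpha_s/d\mu^2 = \beta(\alpha_s)$; the last line applies the same limit to the Ryttov-Sannino form and is only as reliable as that conjecture.

```latex
% Assumptions: NSVZ beta function and the convention \mu^2 d\alpha_s/d\mu^2 = \beta(\alpha_s).
\begin{aligned}
\beta(\alpha_s)
  &= -\frac{\alpha_s^2}{4\pi}\,\frac{3N}{1-\frac{\alpha_s N}{2\pi}}
  \;\xrightarrow{\;\alpha_s N \gg 2\pi\;}\;
  -\frac{\alpha_s^2}{4\pi}\cdot\frac{3N}{-\frac{\alpha_s N}{2\pi}}
  = \frac{3}{2}\,\alpha_s , \\[4pt]
\mu^2\frac{d\alpha_s}{d\mu^2} &= \frac{3}{2}\,\alpha_s
  \quad\Longrightarrow\quad
  \alpha_s(\mu^2) = \alpha_s(\mu_0^2)\left(\frac{\mu^2}{\mu_0^2}\right)^{3/2}
  \;\longrightarrow\; 0 \quad (\mu \to 0), \\[4pt]
% Same limit for the conjectured non-supersymmetric form:
-\frac{11N}{12\pi}\,\frac{\alpha_s^2}{1-\frac{\alpha_s}{2\pi}\frac{17N}{11}}
  &\;\xrightarrow{\;\alpha_s N \gg 2\pi\;}\;
  \frac{11N}{12\pi}\cdot\frac{22\pi}{17N}\,\alpha_s
  = \frac{121}{102}\,\alpha_s > 0 .
\end{aligned}
```

In both cases the beta function is positive and linear in the coupling on the strong-coupling branch, so the coupling decreases as a power of the scale when the energy is lowered.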

Open problems 

The Yang-Mills theories were generally acknowledged in the physics community after Gerard 't Hooft, in 1972,
proved their renormalizability. This applies even if the gauge bosons described by this theory are massive, as in
the electroweak theory. However, the mass is only an "acquired" one, generated by the Higgs mechanism.

Concerning the mathematics, it should be noted that presently, i.e. in 2009, the Yang-Mills theory is a very active 
field of research, yielding e.g. a classification of differentiable structures of four-dimensional manifolds by Simon 
Donaldson. Furthermore, the field of Yang-Mills theories was included in the Clay Mathematics Institute's list of 
"Millennium Prize Problems". Here the prize-problem consists, especially, in a proof of the conjecture that the 
lowest excitations of a pure Yang-Mills theory (i.e. without matter fields) have a finite mass-gap with regard to the 
vacuum state. Another open problem, connected with this conjecture, is a proof of the confinement property in the 
presence of additional Fermion particles. 

In physics the survey of Yang-Mills theories does not usually start from perturbation analysis or analytical methods, 
but more recently from systematic application of numerical methods to lattice gauge theories. 

References 

[1] Straumann, N., "On Pauli's invention of non-abelian Kaluza-Klein Theory in 1953", eprint arXiv:gr-qc/0012054

[2] See Abraham Pais' account of this period as well as L. Susskind's "Superstrings, Physics World on the first
non-abelian gauge theory" where Susskind wrote that Yang-Mills was "rediscovered" only because Pauli had chosen not
to publish

[3] Yang, C. N.; Mills, R. (1954), "Conservation of Isotopic Spin and Isotopic Gauge Invariance", Physical Review
96 (1): 191-195, doi:10.1103/PhysRev.96.191

[4] Caprini, I.; Colangelo, G.; Leutwyler, H. (2006), "Mass and width of the lowest resonance in QCD", Physical
Review Letters 96 (13): 132001, doi:10.1103/PhysRevLett.96.132001

[5] Yndurain, F. J.; Garcia-Martin, R.; Pelaez, J. R. (2007), "Experimental status of the ππ isoscalar S wave at
low energy: f0(600) pole and scattering length", Physical Review D 76 (7): 074034, doi:10.1103/PhysRevD.76.074034

[6] Novikov, V. A.; Shifman, M. A.; Vainshtein, A. I.; Zakharov, V. I. (1983), "Exact Gell-Mann-Low Function Of
Supersymmetric Yang-Mills Theories From Instanton Calculus", Nuclear Physics B 229 (2): 381-393,
doi:10.1016/0550-3213(83)90338-3

[7] Ryttov, T.; Sannino, F. (2008), "Supersymmetry Inspired QCD Beta Function", Physical Review D 78 (6): 065001,
doi:10.1103/PhysRevD.78.065001

[8] Bogolubsky, I. L.; Ilgenfritz, E.-M.; Muller-Preussker, M.; Sternbeck, A. (2009), "Lattice gluodynamics
computation of Landau-gauge Green's functions in the deep infrared", Physics Letters B 676 (1-3): 69-73,
doi:10.1016/j.physletb.2009.04.076






Further reading 

Books 

• Frampton, P. (2008). Gauge Field Theories (3rd ed.). Wiley-VCH. ISBN 978-3527408351. 

• Cheng, T.-P.; Li, L.-F. (1983). Gauge Theory of Elementary Particle Physics. Oxford University Press. 
ISBN 0-19-851961-3. 

• 't Hooft, Gerardus (2005). 50 Years of Yang-Mills theory. World Scientific. ISBN 981-238-934-2. 
Articles 

• Svetlichny, George (1999). " Preparation for Gauge Theory (http://arxiv.org/abs/math-ph/9902027)". 

• Gross, D. (1992). "Gauge theory - Past, Present and Future" (http://psroc.phys.ntu.edu.tw/cjp/v30/955.pdf). 
Retrieved 2009-04-23. 

External links 

• Yang-Mills theory on Dispersive Wiki (http://tosio.math.toronto.edu/wiki/index.php/Yang-Mills_equations) 

• The Clay Mathematics Institute (http://www.claymath.org) 

• The Millennium Prize Problems (http://www.claymath.org/prizeproblems) 



Yangian 

Yangian is an important structure in modern representation theory, a type of a quantum group with origins in 
physics. Yangians first appeared in the work of Ludvig Faddeev and his school concerning the quantum inverse 
scattering method in the late 1970s and early 1980s. Initially they were considered a convenient tool to generate the 
solutions of the quantum Yang-Baxter equation. The name Yangian was introduced by Vladimir Drinfeld in 1985 in 
honor of C.N. Yang. 



Description 

For any finite-dimensional semisimple Lie algebra a, Drinfeld defined an infinite-dimensional Hopf algebra Y(a), 
called the Yangian of a. This Hopf algebra is a deformation of the universal enveloping algebra U(a[z]) of the Lie 
algebra of polynomial loops of a given by explicit generators and relations. The relations can be encoded by 
identities involving a rational R-matrix. Replacing it with a trigonometric R-matrix, one arrives at affine quantum
groups, defined in the same paper of Drinfeld. 

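In the case of the general linear Lie algebra $\mathfrak{gl}_N$, the Yangian admits a simpler description in terms
of a single ternary (or RTT) relation on the matrix generators due to Faddeev and coauthors. The Yangian
$Y(\mathfrak{gl}_N)$ is defined to be the algebra generated by elements $t_{ij}^{(p)}$ with $1 \le i, j \le N$ and
$p \ge 0$, subject to the relations

$$[t_{ij}^{(p+1)}, t_{kl}^{(q)}] - [t_{ij}^{(p)}, t_{kl}^{(q+1)}] = -\left(t_{kj}^{(p)} t_{il}^{(q)} - t_{kj}^{(q)} t_{il}^{(p)}\right).$$

Defining $t_{ij}^{(-1)} = \delta_{ij}$, setting

$$T(z) = \big(t_{ij}(z)\big)_{i,j=1}^{N}, \qquad t_{ij}(z) = \sum_{p \ge -1} t_{ij}^{(p)} z^{-p-1},$$

and introducing the R-matrix $R(z) = I + z^{-1}P$ on $\mathbb{C}^N \otimes \mathbb{C}^N$, where $P$ is the operator
permuting the tensor factors, the above relations can be written more simply as the ternary relation:

$$R_{23}(z - w)\, T_{12}(z)\, T_{13}(w) = T_{13}(w)\, T_{12}(z)\, R_{23}(z - w).$$

The Yangian becomes a Hopf algebra with comultiplication $\Delta$, counit $\varepsilon$ and antipode $s$ given by

$$(\Delta \otimes \mathrm{id})T(z) = T_{12}(z)T_{13}(z), \quad (\varepsilon \otimes \mathrm{id})T(z) = I, \quad (s \otimes \mathrm{id})T(z) = T(z)^{-1}.$$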






At special values of the spectral parameter $(z - w)$, the R-matrix degenerates to a rank one projection. This can
be used to define the quantum determinant of $T(z)$, which generates the center of the Yangian.
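
As a sanity check on the rational R-matrix, the following Python sketch verifies with exact rational arithmetic that $R(z) = I + z^{-1}P$ satisfies the Yang-Baxter equation $R_{12}(u-v)R_{13}(u)R_{23}(v) = R_{23}(v)R_{13}(u)R_{12}(u-v)$, and that at $z = -1$ it drops to rank one. The helper names and the choice N = 2 are illustrative assumptions; for N = 2 the matrix $R(-1) = I - P$ is twice the projection onto the antisymmetric tensors, so it is a projection only up to normalization.

```python
from fractions import Fraction
from itertools import product

N = 2  # illustrative choice: each tensor factor is C^2


def identity(dim):
    return [[Fraction(int(i == j)) for j in range(dim)] for i in range(dim)]


def swap_operator(a, b, factors, n=N):
    """Permutation matrix exchanging tensor factors a and b of a `factors`-fold product of C^n."""
    dim = n ** factors
    mat = [[Fraction(0)] * dim for _ in range(dim)]
    for idx in product(range(n), repeat=factors):
        swapped = list(idx)
        swapped[a], swapped[b] = swapped[b], swapped[a]
        col = sum(d * n ** (factors - 1 - k) for k, d in enumerate(idx))
        row = sum(d * n ** (factors - 1 - k) for k, d in enumerate(swapped))
        mat[row][col] = Fraction(1)
    return mat


def r_matrix(a, b, z, factors, n=N):
    """Rational R-matrix R_ab(z) = I + P_ab / z for a non-zero spectral parameter z."""
    perm = swap_operator(a, b, factors, n)
    eye = identity(n ** factors)
    return [[e + p / z for e, p in zip(erow, prow)] for erow, prow in zip(eye, perm)]


def matmul(x, y):
    cols = list(zip(*y))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in x]


def yang_baxter_holds(u, v):
    """Check R12(u-v) R13(u) R23(v) == R23(v) R13(u) R12(u-v) exactly."""
    r12 = r_matrix(0, 1, u - v, 3)
    r13 = r_matrix(0, 2, u, 3)
    r23 = r_matrix(1, 2, v, 3)
    return matmul(matmul(r12, r13), r23) == matmul(matmul(r23, r13), r12)


def rank(mat):
    """Rank by exact Gaussian elimination over the rationals."""
    rows = [list(r) for r in mat]
    rk = 0
    for col in range(len(rows[0])):
        pivot = next((i for i in range(rk, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[rk], rows[pivot] = rows[pivot], rows[rk]
        for i in range(rk + 1, len(rows)):
            factor = rows[i][col] / rows[rk][col]
            rows[i] = [x - factor * y for x, y in zip(rows[i], rows[rk])]
        rk += 1
    return rk


if __name__ == "__main__":
    print(yang_baxter_holds(Fraction(3), Fraction(5, 7)))  # True
    print(rank(r_matrix(0, 1, Fraction(-1), 2)))           # 1
```

Using `Fraction` rather than floating point makes the equality test exact, so a `True` result is a genuine identity for the chosen parameters rather than an agreement within rounding error.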

The twisted Yangian $Y^-(\mathfrak{gl}_{2N})$, introduced by G. I. Olshansky, is the coideal subalgebra generated
by the coefficients of

$$S(z) = T(z)\,\sigma(T(-z)),$$

where $\sigma$ is the involution of $\mathfrak{gl}_{2N}$ given by

$$\sigma(E_{ij}) = (-1)^{i+j} E_{2N-j+1,\,2N-i+1}.$$

Applications to classical representation theory 

G. I. Olshansky and I. Cherednik discovered that the Yangian of $\mathfrak{gl}_N$ is closely related with the branching properties of
irreducible finite-dimensional representations of general linear algebras. In particular, the classical Gelfand-Tsetlin 
construction of a basis in the space of such a representation has a natural interpretation in the language of Yangians, 
studied by M.Nazarov and V.Tarasov. Olshansky, Nazarov and Molev later discovered a generalization of this 
theory to other classical Lie algebras, based on the twisted Yangian. 

Applications to physics 

The Yangian appears as a symmetry group in different models in physics. The most famous one is the supersymmetric
Yang-Mills field in four dimensions. The Yangian also appears as a symmetry group of one-dimensional exactly
solvable models such as spin chains and the Hubbard model,[1] and in models of one-dimensional relativistic quantum
field theory.

Representation theory of Yangians 

Irreducible finite-dimensional representations of Yangians were parametrized by Drinfeld in a way similar to the 
highest weight theory in the representation theory of semisimple Lie algebras. The role of the highest weight is 
played by a finite set of Drinfeld polynomials. Drinfeld also discovered a generalization of the classical Schur-Weyl 
duality between representations of general linear and symmetric groups that involves the Yangian of $\mathfrak{sl}_N$ and the
degenerate affine Hecke algebra (graded Hecke algebra of type A, in George Lusztig's terminology). 

Representations of Yangians have been extensively studied, but the theory is still under active development. 

References 

• Chari, Vyjayanthi; Andrew Pressley (1994). A Guide to Quantum Groups. Cambridge, U.K.: Cambridge 
University Press. ISBN 0-521-55884-0. 

• Drinfel'd, Vladimir Gershonovich (1985). "Алгебры Хопфа и квантовое уравнение Янга-Бакстера [Hopf
algebras and the quantum Yang-Baxter equation]" (in Russian). Doklady Akademii Nauk SSSR 283 (5):
1060-1064.

• Drinfel'd, V. G. (1987). "[A new realization of Yangians and of quantum affine algebras]" (in Russian). Doklady
Akademii Nauk SSSR 296 (1): 13-17. Translated in Soviet Mathematics - Doklady 36 (2): 212-216. 1988.

• Drinfel'd, V. G. (1986). "Вырожденные аффинные алгебры Гекке и янгианы [Degenerate affine Hecke algebras
and Yangians]" (in Russian). Funktsional'nyi Analiz i Ego Prilozheniya 20 (1): 69-70. MR831053,
Zbl: 0599.20049. Translated in Drinfel'd, V. G. (1986). "Degenerate affine Hecke algebras and Yangians".
Functional Analysis and Its Applications 20 (1): 58-60. doi:10.1007/BF01077318.

• Molev, Alexander Ivanovich (2007). Yangians and Classical Lie Algebras. Mathematical Surveys and 
Monographs. Providence, RI: American Mathematical Society. ISBN 978-0-8218-4374-1. 






References 

[1] http://arxiv.org/pdf/hep-th/9310158

[2] http://www.mathnet.ru/php/getFT.phtml?jrnid=faa&paperid=1254&volume=20&year=1986 
option_lang=eng 

Quantum spacetime 

In mathematical physics, quantum spacetime is the proposal that the actual spacetime that we live in is more
accurately described not by the usual local coordinates $x, y, z, t$ but by operator or algebra variables where the
order of at least some of the products matters. The idea is borrowed from the canonical commutation relations in
quantum mechanics, where position and momentum variables $x, p$ are mutually noncommutative, but is postulated now
for relations between one or more of the spacetime variables themselves. Just as with Heisenberg's uncertainty
principle, a quantum spacetime necessarily comes with uncertainty relations in the sense that not all of
$x, y, z, t$ can simultaneously be ascribed actual numerical values.

There are fundamental physical reasons to believe that spacetime is better modeled in this way. Because of
wave-particle duality the energy needed to probe smaller and smaller distances would be greater and greater, until
the test particles doing the probing would form black holes and destroy the very geometry that one is trying to
measure. This means that our usual picture of continuum spacetime must itself break down as we approach such Planck
scale distances. Quantum spacetime is a particular attempt to address this using ideas from quantum mechanics and
is plausibly expected on the grounds that the corrections to geometry are being induced by quantum gravity.

The mathematics allowing such a physical possibility has been developed under the general headings of 
noncommutative geometry and quantum geometry. In practice the more well-known Connes approach to 
noncommutative geometry has not been applied so much here and the currently best-known models of quantum 
spacetime fall within a more pedestrian quantum groups approach to noncommutative geometry, based on symmetry. 

It is important to insist that not every noncommutative algebra with four generators properly qualifies to be
called a quantum spacetime. One should require at least the following:

• There should be a plausible expectation that such an algebra might actually arise in an effective description of
quantum gravity effects in some regime of that theory. In this context there should be a parameter $\lambda$, say,
controlling the extent of deviation from ordinary spacetime and ultimately identifiable with physical constants
such as the Planck length. One should obtain ordinary Lorentzian spacetime as $\lambda \to 0$.

• Local Lorentz group and Poincare group symmetries should be retained in some sufficient but possibly
generalised form. These symmetries of ordinary spacetime are needed for the formulation of Special Relativity
and their generalisation often takes the form of a quantum group acting on the quantum spacetime algebra.

• There should be a notion of quantum differential calculus on the quantum spacetime algebra, compatible with the
(quantum) symmetry and preferably reducing to the usual classical differential calculus as $\lambda \to 0$.

These and some partial understanding of the rest of the story are the minimum needed to have wave equations for
particles and fields and hence first predictions for the deviations from classical spacetime physics. This in turn is
needed if the theory is ever to be tested.

Several models were found in the 1990s meeting the above criteria to a lesser or greater extent. The most important
of these currently has testable predictions, making the search for quantum spacetime not only a theoretical fancy
but potentially an actual new discovery about our physical world.






Bicrossproduct model spacetime 

The bicrossproduct model spacetime was introduced in 1994 by Shahn Majid and Henri Ruegg[1] and has relations

$$[x_i, x_j] = 0, \quad [x_i, t] = i\lambda x_i$$

for spatial variables $x_i$ and the one time variable $t$. Here $\lambda$ has dimensions of length and is therefore
expected to be something like the Planck length. The Poincare group here is correspondingly deformed, now to a
certain bicrossproduct quantum group with the following characteristic features.

The momentum generators $p_i$ commute among themselves but addition of momenta, reflected in the quantum group
structure, is deformed (momentum space becomes a non-abelian group). Meanwhile, the Lorentz group generators enjoy
their usual relations among themselves but act non-linearly on the momentum space. The orbits for this action are
depicted in the figure as a cross-section of $p_0$ against one of the $p_i$. The on-shell region describing
particles in the upper centre of the image would normally be hyperboloids but these are now 'squashed up' into the
cylinder

$$p_1^2 + p_2^2 + p_3^2 < 1$$

in simplified units. The upshot is that as you try to Lorentz-boost the momentum of a particle you will never exceed
the Planck momentum. The existence of a highest momentum scale or lowest distance scale fits the physical picture.
The existence of such squashing behaviour comes from the non-linearity of the action and is an endemic feature of
bicrossproduct quantum groups known since their introduction in 1988.[2] Some physicists have been so impressed
by this feature of the bicrossproduct model that they have dubbed it doubly special relativity, but such
nomenclature remains controversial and disputed.

[Figure: Orbits for the action of the Lorentz group on momentum space in the construction of the bicrossproduct
model, in units of $\lambda^{-1}$. Mass-shell hyperboloids are 'squashed' into a cylinder.]

Another consequence of the squashing is that the propagation of particles is deformed, even of light, leading to a
variable speed of light prediction. Key to this prediction was a reason to believe that the particular $p_0, p_i$
are plausibly the physical energy and spatial momentum (as opposed to some other function of them), provided in 1999
by Giovanni Amelino-Camelia and Majid[3] through a study of plane waves for a quantum differential calculus in the
model. They take the form

$$e^{i\sum_i p_i x_i}\, e^{i p_0 t},$$

in other words a form which is sufficiently close to classical that one might plausibly believe the interpretation.
At the moment such wave analysis represents the best hope to obtain physically testable predictions from the model.

Prior to this work there were a number of unsupported claims to make predictions from the model based solely on
the form of the Poincare quantum group. There were also claims based on an earlier κ-Poincare quantum group
introduced by Jurek Lukierski and co-workers,[4] which should be viewed as an important precursor to the
bicrossproduct one, albeit without the actual quantum spacetime and with different proposed generators for which
the above picture does not apply. The bicrossproduct model spacetime has also been called κ-deformed spacetime,
with $\kappa = \lambda^{-1}$.






q-Deformed spacetime

This model was introduced independently by a team[5] working under Julius Wess in 1990 and by Majid and coworkers
in a series of papers on braided matrices starting a year later.[6] The point of view in the second approach is
that usual Minkowski spacetime has a nice description via Pauli matrices as the space of 2 x 2 hermitian matrices.
In quantum group theory and using braided monoidal category methods one has a natural q-version of this, defined
here for real values of $q$ as a 'braided hermitian matrix' of generators and relations

$$\begin{pmatrix}\alpha & \beta\\ \gamma & \delta\end{pmatrix} = \begin{pmatrix}\alpha & \beta\\ \gamma & \delta\end{pmatrix}^\dagger, \quad \beta\alpha = q^2\alpha\beta, \quad [\alpha, \delta] = 0, \quad [\beta, \gamma] = (1 - q^{-2})\alpha(\delta - \alpha), \quad [\delta, \beta] = (1 - q^{-2})\alpha\beta.$$

These relations say that the generators commute as $q \to 1$, thereby recovering usual Minkowski space. One can
work with more familiar variables $x, y, z, t$ as linear combinations of these. In particular, time

$$t = \mathrm{Trace}_q \begin{pmatrix}\alpha & \beta\\ \gamma & \delta\end{pmatrix} = q\delta + q^{-1}\alpha$$

is given by a natural braided trace of the matrix and commutes with the other generators (so this model has a very
different flavour from the bicrossproduct one). The braided-matrix picture also leads naturally to a quantity

$$\det{}_q \begin{pmatrix}\alpha & \beta\\ \gamma & \delta\end{pmatrix} = \alpha\delta - q^2\gamma\beta$$

which as $q \to 1$ returns the usual Minkowski distance (this translates to a metric in the quantum differential
geometry). The parameter $q = e^{\lambda}$ or $q = e^{i\lambda}$ is dimensionless and $\lambda$ is thought to be a
ratio of the Planck scale and the cosmological length. That is, there are indications that this model relates to
quantum gravity with non-zero cosmological constant, the choice of $q$ depending on whether this is positive or
negative. We have described the mathematically better understood but perhaps less physically justified positive
case here.

A full understanding of this model requires (and was concurrent with the development of) a full theory of 'braided
linear algebra' for such spaces. The momentum space for the theory is another copy of the same algebra and there is
a certain 'braided addition' of momentum on it, expressed as the structure of a braided Hopf algebra (or quantum
group in a certain braided monoidal category). This theory by 1993 had provided the corresponding q-deformed
Poincare group as generated by such translations and q-Lorentz transformations, completing the interpretation as a
quantum spacetime.[7]

In the process it was discovered that the Poincare group not only had to be deformed but had to be extended to 
include dilations of the quantum spacetime. For such a theory to be exact we would need all particles in the theory to 
be massless, which is consistent with experiment as masses of elementary particles are indeed vanishingly small 
compared to the Planck mass. If current thinking in cosmology is correct then this model is more appropriate, but it 
is significantly more complicated and for this reason its physical predictions have yet to be worked out. 



Fuzzy or spin model spacetime 

In modern usage this refers to the angular momentum algebra

$$[x_1, x_2] = 2i\lambda x_3, \quad [x_2, x_3] = 2i\lambda x_1, \quad [x_3, x_1] = 2i\lambda x_2$$

familiar from quantum mechanics but regarded now as coordinates of a quantum space or spacetime. This is
primarily of interest as a toy model of quantum gravity where we work only in 3 spacetime dimensions (not the
correct 4), presented here with a Euclidean rather than Lorentzian signature. It was first proposed in this context
by Gerardus 't Hooft,[8] while its full development, including a quantum differential calculus and an action of a
certain 'quantum double' quantum group as deformed Euclidean group of motions, was obtained by Majid and
E. Batista.[9]

A striking feature of the noncommutative geometry here is that the smallest covariant quantum differential calculus
has one dimension higher than expected, namely 4, suggesting that the above can also be viewed as the spatial part
of a 4-dimensional quantum spacetime. The model should not be confused with fuzzy spheres, which are
finite-dimensional matrix algebras that one can think of as spheres of fixed radius in the spin model spacetime.
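
The relation between the spin model and fuzzy spheres can be illustrated with a small Python sketch. It assumes the relations $[x_1, x_2] = 2i\lambda x_3$ (and cyclic permutations) and represents the coordinates as $x_i = 2\lambda J_i$, where the $J_i$ are the standard spin-$j$ matrices; in such a representation $x_1^2 + x_2^2 + x_3^2 = 4\lambda^2 j(j+1)$ times the identity, which is the "fixed radius" of the corresponding matrix algebra. The function names and the numerical tolerance are illustrative assumptions.

```python
import math


def spin_matrices(two_j):
    """Return (J1, J2, J3) for spin j = two_j / 2 in the basis m = j, j-1, ..., -j."""
    j = two_j / 2.0
    dim = two_j + 1
    m_values = [j - k for k in range(dim)]
    j3 = [[complex(m_values[r]) if r == c else 0j for c in range(dim)] for r in range(dim)]
    raising = [[0j] * dim for _ in range(dim)]
    for c in range(1, dim):
        m = m_values[c]
        # J+ |j, m> = sqrt(j(j+1) - m(m+1)) |j, m+1>
        raising[c - 1][c] = complex(math.sqrt(j * (j + 1) - m * (m + 1)))
    lowering = [[raising[c][r].conjugate() for c in range(dim)] for r in range(dim)]
    j1 = [[(raising[r][c] + lowering[r][c]) / 2 for c in range(dim)] for r in range(dim)]
    j2 = [[(raising[r][c] - lowering[r][c]) / 2j for c in range(dim)] for r in range(dim)]
    return j1, j2, j3


def matmul(a, b):
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def scaled(a, s):
    return [[s * x for x in row] for row in a]


def commutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[x - y for x, y in zip(r1, r2)] for r1, r2 in zip(ab, ba)]


def close(a, b, tol=1e-12):
    return all(abs(x - y) < tol for r1, r2 in zip(a, b) for x, y in zip(r1, r2))


def coordinates(two_j, lam):
    """Coordinates x_i = 2 * lam * J_i in the spin-j representation."""
    return [scaled(m, 2 * lam) for m in spin_matrices(two_j)]


def satisfies_relations(two_j, lam):
    """Check [x1, x2] = 2i lam x3 and its cyclic permutations."""
    x1, x2, x3 = coordinates(two_j, lam)
    return (close(commutator(x1, x2), scaled(x3, 2j * lam))
            and close(commutator(x2, x3), scaled(x1, 2j * lam))
            and close(commutator(x3, x1), scaled(x2, 2j * lam)))


def radius_squared(two_j, lam):
    """Return the matrix x1^2 + x2^2 + x3^2."""
    squares = [matmul(x, x) for x in coordinates(two_j, lam)]
    dim = two_j + 1
    return [[sum(s[r][c] for s in squares) for c in range(dim)] for r in range(dim)]
```

For example, `satisfies_relations(2, 0.5)` checks the spin-1 case with $\lambda = 0.5$, and `radius_squared(2, 0.5)` should be close to twice the 3 x 3 identity, since $4\lambda^2 j(j+1) = 2$ there.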

Heisenberg model spacetimes 

The first concrete model of quantum spacetime is often attributed to Hartland Snyder in 1947; however, his
paper[10] actually proposes that

$$[x_\mu, x_\nu] = i M_{\mu\nu}$$

where the $M_{\mu\nu}$ generate, and are interpreted as, the Lorentz group. This is not an actual quantum spacetime
in the sense above because the spacetime coordinates $x_\mu$ do not form a self-contained algebra among themselves;
rather, Snyder was proposing a radical unification of spacetime with the Lorentz and Poincare groups.

The idea was revived in a modern context by Sergio Doplicher, Klaus Fredenhagen and John Roberts in 1995[11] by
letting $M_{\mu\nu}$ simply be viewed as some function of $x_\mu$ as defined by the above relation, and any
relations involving it viewed as higher order relations among the $x_\mu$. The Lorentz symmetry is arranged so as
to transform the indices as usual and without being deformed.

An even simpler variant of this model is to let $M_{\mu\nu}$ here be a numerical antisymmetric tensor, in which
context it is usually denoted $\theta_{\mu\nu}$, so the relations are $[x_\mu, x_\nu] = i\theta_{\mu\nu}$. In even
dimensions $D$, any nondegenerate such theta can be transformed to a normal form in which this really is just the
Heisenberg algebra, with the difference that the variables are being proposed as those of spacetime. This proposal
was for a time quite popular because of its familiar form of relations and because it has been argued that it
emerges from the theory of open strings landing on D-branes; see noncommutative quantum field theory and Moyal
plane. However, it should be realised that this D-brane lives in some of the higher spacetime dimensions in the
theory and hence it is not our physical spacetime that string theory suggests to be effectively quantum in this
way. You also have to subscribe to D-branes as an approach to quantum gravity in the first place. Even when posited
as quantum spacetime it is hard to obtain physical predictions, and one reason for this is that if $\theta$ is a
tensor then by dimensional analysis it should have dimensions of length², and if this length is speculated to be
the Planck length then the effects would be even harder to ever detect than for other models.
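
The dimensional argument can be made slightly more explicit. The sketch below assumes relations of the form $[x_\mu, x_\nu] = i\theta_{\mu\nu}$ with a constant tensor $\theta_{\mu\nu}$, applies the standard Robertson uncertainty inequality, and uses the Planck length $\ell_P \approx 1.6 \times 10^{-35}$ m as an illustrative scale; the observation length $L$ is a hypothetical symbol introduced only for the comparison.

```latex
% Assumption: [x_\mu, x_\nu] = i\theta_{\mu\nu} with \theta_{\mu\nu} a constant (c-number) tensor.
% Robertson's inequality \Delta A\,\Delta B \ge \tfrac12 |\langle [A,B] \rangle| then gives
\Delta x_\mu\, \Delta x_\nu \;\ge\; \tfrac{1}{2}\,|\theta_{\mu\nu}| .
% Illustrative magnitude, assuming |\theta| \sim \ell_P^2 with \ell_P \approx 1.6 \times 10^{-35}\,\mathrm{m}:
|\theta| \sim \ell_P^2 \approx 2.6 \times 10^{-70}\,\mathrm{m}^2 .
% On dimensional grounds, a dimensionless correction at an observation length L scales as
\frac{|\theta|}{L^2} \sim \left(\frac{\ell_P}{L}\right)^{2}
\quad\text{rather than}\quad \frac{\lambda}{L} \sim \frac{\ell_P}{L}
\ \text{for a model with a length parameter } \lambda .
```

The extra power of the already tiny ratio $\ell_P/L$ is what makes effects of this model harder to detect than those of models whose deformation parameter is a single length.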

Noncommutative extensions to spacetime 

Although not quantum spacetime in the sense above, another use of noncommutative geometry is to tack on
'noncommutative extra dimensions' at each point of ordinary spacetime. Instead of invisible curled up extra
dimensions as in string theory, Alain Connes and coworkers have argued that the coordinate algebra of this extra part
should be replaced by a finite-dimensional noncommutative algebra. For a certain reasonable choice of this algebra,
its representation and extended Dirac operator, one is able to recover the Standard Model of elementary particles. In
this point of view the different kinds of matter particles are manifestations of geometry in these extra
noncommutative directions. Connes' first works here date from 1989, but the approach has been developed considerably
since then. Such an approach can theoretically be combined with quantum spacetime as above.






References 

[1] Majid, S.; Ruegg, H. (1994), "Bicrossproduct structure of the κ-Poincaré group and noncommutative geometry", Physics Letters B 334: 348-354, doi:10.1016/0370-2693(94)90699-8

[2] Majid, S. (1988), "Hopf algebras for physics at the Planck scale", Classical and Quantum Gravity 5: 1587-1607, doi:10.1088/0264-9381/5/12/010

[3] Amelino-Camelia, G.; Majid, S. (2000), "Waves on noncommutative spacetime and gamma-ray bursts", International J. Mod. Phys. A 15: 4301-4323

[4] Lukierski, J.; Nowicki, A.; Ruegg, H.; Tolstoy, V.N. (1991), "q-Deformation of Poincaré algebras", Physics Letters B 268: 331-338

[5] Carow-Watamura, U.; Schlieker, M.; Scholl, M.; Watamura, S. (1990), "Tensor representation of the quantum group SL_q(2, C) and quantum Minkowski space", Z. Phys. C 48: 159, doi:10.1007/BF01565619

[6] Majid, S. (1991), "Examples of braided groups and braided matrices", J. Math. Phys. 32: 3246-3253, doi:10.1063/1.529485

[7] Majid, S. (1993), "Braided momentum in the q-Poincaré group", J. Math. Phys. 34: 2045-2058, doi:10.1063/1.530154

[8] 't Hooft, G. (1996), "Quantization of point particles in (2+1)-dimensional gravity and spacetime discreteness", Classical and Quantum Gravity 13: 1023-1039, doi:10.1088/0264-9381/13/5/018

[9] Batista, E.; Majid, S. (2003), "Noncommutative geometry of angular momentum space U(su_2)", J. Math. Phys. 44: 107-137, doi:10.1063/1.1517395

[10] Snyder, H. (1947), "Quantized space-time", Phys. Rev. 71: 38-41

[11] Doplicher, S.; Fredenhagen, K.; Roberts, J.E. (1995), "The quantum structure of spacetime at the Planck scale and quantum fields", Commun. Math. Phys. 172: 187-220, doi:10.1007/BF02104515

[12] Seiberg, N.; Witten, E. (1999), "String theory and noncommutative geometry", JHEP 9909: 032

[13] Connes, A.; Lott, J. (1989), "Particle models and noncommutative geometry", Nucl. Phys. Proc. Suppl. B 18: 29, doi:10.1016/0920-5632(91)90120-4

Further reading 

• Majid, S. (1995), Foundations of Quantum Group Theory, Cambridge University Press 

• D. Oriti, ed. (2009), Approaches to Quantum Gravity, Cambridge University Press 

• Connes, A.; Marcolli, M. (2007), Noncommutative Geometry, Quantum Fields and Motives, Colloquium 
Publications 

• Majid, S.; Schroers, B.J. (2009), "q-Deformation and semidualization in 3D quantum gravity", J. Phys. A:
Math. Theor. 42: 425402 (40pp), doi:10.1088/1751-8113/42/42/425402

• R. P. Grimaldi, Discrete and Combinatorial Mathematics: An Applied Introduction, 4th Ed. Addison-Wesley
1999.

• J. Matousek, J. Nesetril, Invitation to Discrete Mathematics. Oxford University Press 1998. 

• Taylor E. F., John A. Wheeler, Spacetime Physics, publisher W. H. Freeman, 1963. 

External links 

• Plus Magazine article on quantum geometry (http://plus.maths.org/issue43/features/noncom/index-gifd.html) by Marianne Freiberger

• S. Majid, ed. (2008), On Space and Time, Cambridge University Press (http://www.ewidgetsonline.com/dxreader/Reader.aspx?token=KSvQC0T+8xw7PY+iwDj29A==&rand=1970407385&buyNowLink=http://www.cambridge.org/us/catalogue/AddToBasket.asp?isbn=9780521889261)



Quantum gauge theory 




To quantize a gauge theory, such as Yang-Mills theory, Chern-Simons theory or the BF model, one method is
to perform a gauge fixing. This is done in the BRST and Batalin-Vilkovisky formulations. Another is to factor out the
symmetry by dispensing with vector potentials altogether (they are not physically observable anyway) and to work
directly with Wilson loops, Wilson lines contracted with other charged fields at their endpoints, and spin networks.

Older approaches to quantization for Abelian models use the Gupta-Bleuler formalism with a "semi-Hilbert space"
with an indefinite sesquilinear form. However, it is much more elegant to work with the quotient space of vector
field configurations by gauge transformations.

An alternative approach using lattice approximations is covered in (Wick-rotated) lattice gauge theory.

Establishing the existence of Yang-Mills theory and a mass gap is one of the seven Millennium Prize Problems of
the Clay Mathematics Institute.



Standard Model 




The standard model of particle physics is a theory concerning the electromagnetic, weak, and strong nuclear
interactions, which mediate the dynamics of the known subatomic particles. Developed throughout the early and
middle 20th century, the current formulation was finalized in the mid-1970s upon experimental confirmation of
the existence of quarks. Since then, discoveries of the bottom quark (1977), the top quark (1995) and the tau
neutrino (2000) have given credence to the standard model. Because of its success in explaining a wide variety of
experimental results, the standard model is sometimes regarded as a theory of almost everything.

Still, the standard model falls short of being a complete theory of fundamental interactions because it does not
incorporate the physics of general relativity, such as gravitation and dark energy. The theory does not contain any
viable dark matter particle that possesses all of the required properties deduced from observational cosmology. It
also does not correctly account for neutrino oscillations (and their non-zero masses). Although the standard model is
theoretically self-consistent, it has several unnatural properties giving rise to puzzles like the strong CP problem and
the hierarchy problem.

Nevertheless, the standard model is important to theoretical and experimental particle physicists alike. For
theoreticians, the standard model is a paradigm example of a quantum field theory, which exhibits a wide range of
physics including spontaneous symmetry breaking, anomalies, non-perturbative behavior, etc. It is used as a basis for
building more exotic models which incorporate hypothetical particles, extra dimensions and elaborate symmetries
(such as supersymmetry) in an attempt to explain experimental results at variance with the standard model, such as
the existence of dark matter and neutrino oscillations. In turn, the experimenters have incorporated the standard
model into simulators to help search for new physics beyond the standard model from relatively uninteresting
background.

[Figure: The Standard Model of elementary particles, with the gauge bosons in the rightmost column. Panel title: Three Generations of Matter (Fermions); each cell gives mass, charge, spin and name. Masses shown: up 2.4 MeV, charm 1.27 GeV, top 171.2 GeV; down 4.8 MeV, strange 104 MeV, bottom 4.2 GeV; electron neutrino < 2.2 eV, muon neutrino < 0.17 MeV, tau neutrino < 15.5 MeV; electron 0.511 MeV, muon 105.7 MeV, tau 1.777 GeV; photon 0, gluon 0, Z 91.2 GeV, W 80.4 GeV.]

Recently, the standard model has found applications in other fields besides particle physics such as astrophysics and 
cosmology, in addition to nuclear physics. 

Historical background 

The first step towards the Standard Model was Sheldon Glashow's discovery, in 1960, of a way to combine the
electromagnetic and weak interactions.[1] In 1967, Steven Weinberg[2] and Abdus Salam[3] incorporated the Higgs
mechanism[4][5][6] into Glashow's electroweak theory, giving it its modern form.

The Higgs mechanism is believed to give rise to the masses of all the elementary particles in the Standard Model.
This includes the masses of the W and Z bosons, and the masses of the fermions, i.e. the quarks and leptons.

After the neutral weak currents caused by Z boson exchange were discovered at CERN in 1973,[7][8][9][10] the
electroweak theory became widely accepted, and Glashow, Salam, and Weinberg shared the 1979 Nobel Prize in
Physics for discovering it. The W and Z bosons were discovered experimentally in 1983, and their masses were
found to be as the Standard Model predicted.

The theory of the strong interaction, to which many contributed, acquired its modern form around 1973-74, when 
experiments confirmed that the hadrons were composed of fractionally charged quarks. 

Overview 

At present, matter and energy are best understood in terms of the kinematics and interactions of elementary particles. 
To date, physics has reduced the laws governing the behavior and interaction of all known forms of matter and 
energy to a small set of fundamental laws and theories. A major goal of physics is to find the "common ground" that 
would unite all of these theories into one integrated theory of everything, of which all the other known laws would 
be special cases, and from which the behavior of all matter and energy could be derived (at least in principle).[11]

The Standard Model groups two major extant theories — quantum electroweak and quantum chromodynamics — into 
an internally consistent theory that describes the interactions between all known particles in terms of quantum field 
theory. For a technical description of the fields and their interactions, see Standard Model (mathematical 
formulation). 

Particle content 



Fermions 






Organization of Fermions

          Charge   First generation              Second generation          Third generation
Quarks    +2/3     Up ($u$)                      Charm ($c$)                Top ($t$)
          -1/3     Down ($d$)                    Strange ($s$)              Bottom ($b$)
Leptons   -1       Electron ($e^-$)              Muon ($\mu^-$)             Tau ($\tau^-$)
          0        Electron neutrino ($\nu_e$)   Muon neutrino ($\nu_\mu$)  Tau neutrino ($\nu_\tau$)

The Standard Model includes 12 elementary particles of spin-1/2 known as fermions. According to the spin-statistics
theorem, fermions respect the Pauli exclusion principle. Each fermion has a corresponding antiparticle.

The fermions of the Standard Model are classified according to how they interact (or equivalently, by what charges 
they carry). There are six quarks (up, down, charm, strange, top, bottom), and six leptons (electron, electron 
neutrino, muon, muon neutrino, tau, tau neutrino). Pairs from each classification are grouped together to form a 
generation, with corresponding particles exhibiting similar physical behavior (see table). 

The defining property of the quarks is that they carry color charge, and hence, interact via the strong interaction. A 
phenomenon called color confinement results in quarks being perpetually (or at least since very soon after the start of 
the Big Bang) bound to one another, forming color-neutral composite particles (hadrons) containing either a quark 
and an antiquark (mesons) or three quarks (baryons). The familiar proton and the neutron are the two baryons having 
the smallest mass. Quarks also carry electric charge and weak isospin. Hence they interact with other fermions both 
electromagnetically and via the weak nuclear interaction. 

The remaining six fermions do not carry color charge and are called leptons. The three neutrinos do not carry electric 
charge either, so their motion is directly influenced only by the weak nuclear force, which makes them notoriously 
difficult to detect. However, by virtue of carrying an electric charge, the electron, muon, and tau all interact 
electromagnetically. 

Each member of a generation has greater mass than the corresponding particles of lower generations. The
first-generation charged particles do not decay; hence all ordinary (baryonic) matter is made of such particles.
Specifically, all atoms consist of electrons orbiting atomic nuclei ultimately constituted of up and down quarks.
Second- and third-generation charged particles, on the other hand, decay with very short half-lives, and are observed
only in very high-energy environments. Neutrinos of all generations also do not decay, and pervade the universe, but
rarely interact with baryonic matter.






Gauge bosons 

In the Standard Model, gauge bosons are force carriers that mediate the strong, weak, and electromagnetic
fundamental interactions.

Interactions in physics are the ways that particles influence other particles. At a macroscopic level,
electromagnetism allows particles to interact with one another via electric and magnetic fields, and gravitation
allows particles with mass to attract one another in accordance with Einstein's general relativity. The standard
model explains such forces as resulting from matter particles exchanging other particles, known as force mediating
particles. (Strictly speaking, this is only so if one interprets literally what is actually an approximation
method known as perturbation theory, as opposed to the exact theory.) When a force mediating particle is exchanged,
at a macroscopic level the effect is equivalent to a force influencing both of them, and the particle is therefore said to
have mediated (i.e., been the agent of) that force. The Feynman diagram calculations, which are a graphical form of
the perturbation theory approximation, invoke "force mediating particles" and, when applied to analyze high-energy
scattering experiments, are in reasonable agreement with the data. In other situations perturbation theory (and with
it the concept of a "force mediating particle") fails. These include low-energy QCD, bound states, and solitons.

The gauge bosons of the Standard Model also all have spin (as do matter particles), but in their case, the value of the 
spin is 1, making them bosons. As a result, they do not follow the Pauli exclusion principle. The different types of 
gauge bosons are described below. 

• Photons mediate the electromagnetic force between electrically charged particles. The photon is massless and is 
well-described by the theory of quantum electrodynamics. 

• The W⁺, W⁻, and Z gauge bosons mediate the weak interactions between particles of different flavors (all quarks
and leptons). They are massive, with the Z being more massive than the W±. The weak interactions involving the
W± act exclusively on left-handed particles and right-handed antiparticles. Furthermore, the W± carry an electric
charge of +1 and -1 and couple to the electromagnetic interaction. The electrically neutral Z boson interacts with
both left-handed particles and antiparticles. These three gauge bosons, along with the photon, are grouped together
and collectively mediate the electroweak interactions.

• The eight gluons mediate the strong interactions between color charged particles (the quarks). Gluons are
massless. The eightfold multiplicity of gluons is labeled by a combination of color and an anticolor charge (e.g.,
red-antigreen).[12] Because gluons carry an effective color charge, they can interact among themselves. The
gluons and their interactions are described by the theory of quantum chromodynamics.
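The eightfold multiplicity of gluons can be illustrated with a short counting sketch. It enumerates the nine color-anticolor labels and compares with the dimension n² - 1 of the adjoint representation of SU(n); the function names are hypothetical, and the identification of the one removed combination with the color-singlet state is standard group theory rather than something stated in the text above.

```python
from itertools import product

COLORS = ("red", "green", "blue")


def color_anticolor_pairs(colors=COLORS):
    """Return every (color, anticolor) label, e.g. ('red', 'anti-green')."""
    return [(c, f"anti-{a}") for c, a in product(colors, repeat=2)]


def adjoint_dimension(n):
    """Dimension of the adjoint representation of SU(n): n^2 - 1."""
    return n * n - 1


pairs = color_anticolor_pairs()
n_gluons = adjoint_dimension(len(COLORS))

print(len(pairs))   # number of color-anticolor combinations
print(n_gluons)     # number of independent gluons for SU(3)
```

Nine combinations minus the single color-neutral one leaves the eight gluons of SU(3).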

The interactions between all the particles described by the Standard Model are summarized by the diagram at the top 
of this section. 







Higgs boson 

The Higgs particle is a hypothetical massive scalar elementary particle theorized by Robert Brout, François Englert,
Peter Higgs, Gerald Guralnik, C. R. Hagen, and Tom Kibble in 1964 (see 1964 PRL symmetry breaking papers) and
is a key building block in the Standard Model.[13][14][15][16] It has no intrinsic spin, and for that reason is classified
as a boson (like the gauge bosons, which have integer spin). Because an exceptionally large amount of energy and
beam luminosity are theoretically required to observe a Higgs boson in high energy colliders, it is the only
fundamental particle predicted by the Standard Model that has yet to be observed.

The Higgs boson plays a unique role in the Standard Model, by explaining why the other elementary particles, the 
photon and gluon excepted, are massive. In particular, the Higgs boson would explain why the photon has no mass, 
while the W and Z bosons are very heavy. Elementary particle masses, and the differences between 
electromagnetism (mediated by the photon) and the weak force (mediated by the W and Z bosons), are critical to 
many aspects of the structure of microscopic (and hence macroscopic) matter. In electroweak theory, the Higgs 
boson generates the masses of the leptons (electron, muon, and tau) and quarks. 

As yet, no experiment has directly detected the existence of the Higgs boson. It is hoped that the Large Hadron
Collider at CERN will confirm the existence of this particle. It is also possible that the Higgs boson may already
have been produced but overlooked.[17]

Field content 

The standard model has the following fields:

Spin 1

1. A U(1) gauge field $B_\mu$ with coupling $g'$ (weak U(1), or weak hypercharge)

2. An SU(2) gauge field $W_\mu$ with coupling $g$ (weak SU(2), or weak isospin)

3. An SU(3) gauge field $G_\mu$ with coupling $g_s$ (strong SU(3), or color charge)

Spin 1/2

The spin-1/2 particles are in representations of the gauge groups. For the U(1) group, we list the value of the weak
hypercharge instead. The left-handed fermionic fields are:

1. An SU(3) triplet, SU(2) doublet, with U(1) weak hypercharge 1/3 (left-handed quarks)

2. An SU(3) antitriplet, SU(2) singlet, with U(1) weak hypercharge 2/3 (left-handed down-type antiquark)

3. An SU(3) singlet, SU(2) doublet, with U(1) weak hypercharge -1 (left-handed lepton)

4. An SU(3) antitriplet, SU(2) singlet, with U(1) weak hypercharge -4/3 (left-handed up-type antiquark)

5. An SU(3) singlet, SU(2) singlet, with U(1) weak hypercharge 2 (left-handed antilepton)

By CPT symmetry, there is a set of right-handed fermions with the opposite quantum numbers.

This describes one generation of leptons and quarks, and there are three generations, so there are three copies of each
field. Note that there are twice as many left-handed lepton field components as left-handed antilepton field
components in each generation, but an equal number of left-handed quark and antiquark fields.
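The component counting in the last paragraph can be checked mechanically. The sketch below tabulates one generation of left-handed fields by the dimensions of their SU(3) and SU(2) representations and their weak hypercharge, using the values from the list above; the field labels and helper names are hypothetical and serve only this illustration.

```python
from fractions import Fraction as F

# One generation of left-handed fields:
# (label, SU(3) dimension, SU(2) dimension, weak hypercharge Y).
# The numbers are copied from the list above; the labels are illustrative.
FIELDS = [
    ("quark",               3, 2, F(1, 3)),
    ("down-type antiquark", 3, 1, F(2, 3)),
    ("lepton",              1, 2, F(-1)),
    ("up-type antiquark",   3, 1, F(-4, 3)),
    ("antilepton",          1, 1, F(2)),
]


def n_components(su3_dim, su2_dim):
    """Number of field components: color states times weak-isospin states."""
    return su3_dim * su2_dim


counts = {name: n_components(c, w) for name, c, w, _ in FIELDS}
quarks = counts["quark"]
antiquarks = counts["down-type antiquark"] + counts["up-type antiquark"]
leptons = counts["lepton"]
antileptons = counts["antilepton"]

# Hypercharges summed over every component of the generation.
hypercharge_sum = sum(n_components(c, w) * y for _, c, w, y in FIELDS)

print(quarks, antiquarks)    # quark versus antiquark components
print(leptons, antileptons)  # lepton versus antilepton components
print(hypercharge_sum)       # multiplicity-weighted sum of Y
```

With these inputs there are six quark and six antiquark components, two lepton components against one antilepton component, and the multiplicity-weighted hypercharges add up to zero.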






Spin 0

1. An SU(2) doublet $H$ with U(1) hypercharge -1 (Higgs field)

Note that $|H|^2$, summed over the two SU(2) components, is invariant under both SU(2) and U(1), and so it can
appear as a renormalizable term in the Lagrangian, as can its square.

This field acquires a vacuum expectation value, leaving a combination of the weak isospin and weak hypercharge
unbroken. This is the electromagnetic gauge group, and the photon remains massless. The standard formula for the
electric charge (which defines the normalization of the weak hypercharge, $Y$, which would otherwise be somewhat
arbitrary) is:[18]

$$Q = T_3 + \frac{Y}{2}$$
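As a quick check of this charge formula, the following sketch evaluates $Q = T_3 + Y/2$ for the first-generation left-handed fields, with exact fractions. The $T_3$ and $Y$ assignments are taken from the field list and fermion tables of this article; the state labels and the function name are hypothetical.

```python
from fractions import Fraction as F


def electric_charge(t3, y):
    """Electric charge from weak isospin component T3 and weak hypercharge Y."""
    return F(t3) + F(y) / 2


# (T3, Y) for first-generation left-handed states; labels are illustrative.
STATES = {
    "u_L":      (F(1, 2),  F(1, 3)),
    "d_L":      (F(-1, 2), F(1, 3)),
    "nu_L":     (F(1, 2),  F(-1)),
    "e_L":      (F(-1, 2), F(-1)),
    "anti-u_L": (F(0),     F(-4, 3)),
    "anti-d_L": (F(0),     F(2, 3)),
    "anti-e_L": (F(0),     F(2)),
}

charges = {name: electric_charge(t3, y) for name, (t3, y) in STATES.items()}

for name, q in charges.items():
    print(f"{name:9s} Q = {q}")
```

The familiar values +2/3, -1/3, 0 and -1 come out for the up quark, down quark, neutrino and electron, with opposite signs for the corresponding antiparticles.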

Lagrangian 

The Lagrangian for the spin-1 and spin-1/2 fields is the most general renormalizable gauge field Lagrangian with no
fine tunings:

• Spin 1: 
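One common way to write the spin-1 term is sketched below. It assumes generators normalized so that $\mathrm{tr}(T^a T^b) = \tfrac{1}{2}\delta^{ab}$, a convention the text does not fix, so the numerical prefactors should be read as illustrative rather than as the article's own normalization.

```latex
\mathcal{L}_{\text{spin 1}}
  = -\frac{1}{4} B_{\mu\nu} B^{\mu\nu}
    -\frac{1}{2} \operatorname{tr}\!\left( W_{\mu\nu} W^{\mu\nu} \right)
    -\frac{1}{2} \operatorname{tr}\!\left( G_{\mu\nu} G^{\mu\nu} \right)
```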

where the traces are over the SU(2) and SU(3) indices hidden in W and G respectively. The two-index objects are the
field strengths derived from the vector fields W and G. There are also two extra hidden parameters: the theta angles
for SU(2) and SU(3).

The spin-1/2 particles can have no mass terms because there is no right/left helicity pair with the same SU(2) and
SU(3) representation and the same weak hypercharge. This means that if the gauge charges were conserved in the
vacuum, none of the spin-1/2 particles could ever swap helicity, and they would all be massless.

For a neutral fermion, for example a hypothetical right-handed lepton $N$ (or $N^\alpha$ in relativistic two-spinor notation),
with no SU(3), SU(2) representation and zero charge, it is possible to add the term:

$$\int M N^\alpha N^\beta \epsilon_{\alpha\beta} + \bar{N}^{\dot\alpha} \bar{N}^{\dot\beta} \epsilon_{\dot\alpha\dot\beta}.$$

This term gives the neutral fermion a Majorana mass. Since the generic value for M will be of order 1, such a particle 
would generically be unacceptably heavy. The interactions are completely determined by the theory - the leptons 
introduce no extra parameters. 

Higgs mechanism 

The Lagrangian for the Higgs includes the most general renormalizable self interaction: 

$$S_{\mathrm{Higgs}} = \int d^4x \left[ (D_\mu H)^* (D^\mu H) + \lambda \left( |H|^2 - v^2 \right)^2 \right].$$

The parameter $v^2$ has dimensions of mass squared, and it gives the location where the classical Lagrangian is at a
minimum. In order for the Higgs mechanism to work, $v^2$ must be a positive number. $v$ has units of mass, and it is the
only parameter in the standard model which is not dimensionless. It is also much smaller than the Planck scale; it is
approximately equal to the Higgs mass, and sets the scale for the mass of everything else. This is the only real
fine-tuning to a small nonzero value in the standard model, and it is called the hierarchy problem.

It is traditional to choose the SU(2) gauge so that the Higgs doublet in the vacuum has expectation value (v,0). 
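The claim that the quartic term is minimized at $|H|^2 = v^2$ only when $v^2$ is positive can be made explicit with a short worked calculation. The sketch assumes $\lambda > 0$ and writes $h = |H| \ge 0$; the function name $f$ is introduced here purely for illustration.

```latex
f(h) = \lambda \left( h^2 - v^2 \right)^2, \qquad
f'(h) = 4 \lambda h \left( h^2 - v^2 \right), \qquad
f''(h) = 4 \lambda \left( 3 h^2 - v^2 \right).

% Stationary points for v^2 > 0:
h = 0:\quad f''(0) = -4 \lambda v^2 < 0 \quad \text{(local maximum)}, \qquad
h = v:\quad f''(v) = 8 \lambda v^2 > 0 \quad \text{(minimum, with } f(v) = 0\text{)}.
```

For $v^2 \le 0$ the only real stationary point is $h = 0$, so no nonzero expectation value arises and the symmetry would stay unbroken.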






Masses and CKM matrix 

The rest of the interactions are the most general spin-0 spin-1/2 Yukawa interactions, and there are many of these.
These constitute most of the free parameters in the model. The Yukawa couplings generate the masses and mixings
once the Higgs gets its vacuum expectation value.

The terms LHR generate a mass term for each of the three generations of leptons. There are 9 of these terms, but by
relabeling L and R, the matrix can be diagonalized. Since only the upper component of H is nonzero, the upper
SU(2) component of L mixes with R to make the electron, the muon, and the tau, leaving over a lower massless
component, the neutrino. (Neutrino oscillations show that neutrinos have mass; see the press release of 31 May 2010
at http://operaweb.lngs.infn.it/spip.php?rubrique14.)

The terms QHU generate up masses, while QHD generate down masses. But since there is more than one
right-handed singlet in each generation, it is not possible to diagonalize both with a good basis for the fields, and
there is an extra CKM matrix.
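To make the CKM matrix concrete, the sketch below builds it from three mixing angles and one CP-violating phase and checks that the result is unitary. It uses the standard three-angle parametrization, which is an assumption of this example since the text does not name a parametrization; the numerical inputs are the CKM angles and phase quoted in the parameter table of this article.

```python
import cmath
import math


def ckm_matrix(theta12_deg, theta23_deg, theta13_deg, delta_rad):
    """3x3 CKM matrix in the standard three-angle, one-phase parametrization."""
    t12, t23, t13 = (math.radians(a) for a in (theta12_deg, theta23_deg, theta13_deg))
    s12, c12 = math.sin(t12), math.cos(t12)
    s23, c23 = math.sin(t23), math.cos(t23)
    s13, c13 = math.sin(t13), math.cos(t13)
    e_plus = cmath.exp(1j * delta_rad)
    e_minus = cmath.exp(-1j * delta_rad)
    return [
        [c12 * c13, s12 * c13, s13 * e_minus],
        [-s12 * c23 - c12 * s23 * s13 * e_plus,
         c12 * c23 - s12 * s23 * s13 * e_plus,
         s23 * c13],
        [s12 * s23 - c12 * c23 * s13 * e_plus,
         -c12 * s23 - s12 * c23 * s13 * e_plus,
         c23 * c13],
    ]


def unitarity_defect(v):
    """Largest |(V V^dagger - 1)_ij|; vanishes for an exactly unitary matrix."""
    worst = 0.0
    for i in range(3):
        for j in range(3):
            entry = sum(v[i][k] * v[j][k].conjugate() for k in range(3))
            target = 1.0 if i == j else 0.0
            worst = max(worst, abs(entry - target))
    return worst


# Angles in degrees and phase in radians, as listed in the parameter table.
V = ckm_matrix(13.1, 2.4, 0.2, 0.995)

for row in V:
    print("  ".join(f"{abs(x):.4f}" for x in row))
print("unitarity defect:", unitarity_defect(V))
```

The magnitudes are close to one on the diagonal and small off it, reflecting the fact that quark mixing between generations is weak.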

Theoretical aspects 

Construction of the Standard Model Lagrangian 



Parameters of the Standard Model

Symbol                 Description                    Renormalization scheme (point)            Value
$m_e$                  Electron mass                                                            511 keV
$m_\mu$                Muon mass                                                                105.7 MeV
$m_\tau$               Tau mass                                                                 1.78 GeV
$m_u$                  Up quark mass                  $\mu_{\overline{\mathrm{MS}}}$ = 2 GeV    1.9 MeV
$m_d$                  Down quark mass                $\mu_{\overline{\mathrm{MS}}}$ = 2 GeV    4.4 MeV
$m_s$                  Strange quark mass             $\mu_{\overline{\mathrm{MS}}}$ = 2 GeV    87 MeV
$m_c$                  Charm quark mass               $\mu_{\overline{\mathrm{MS}}}$ = $m_c$    1.32 GeV
$m_b$                  Bottom quark mass              $\mu_{\overline{\mathrm{MS}}}$ = $m_b$    4.24 GeV
$m_t$                  Top quark mass                 On-shell scheme                           172.7 GeV
$\theta_{12}$          CKM 12-mixing angle                                                      13.1°
$\theta_{23}$          CKM 23-mixing angle                                                      2.4°
$\theta_{13}$          CKM 13-mixing angle                                                      0.2°
$\delta$               CKM CP-violating phase                                                   0.995
$g_1$                  U(1) gauge coupling            $\mu_{\overline{\mathrm{MS}}}$ = $m_Z$    0.357
$g_2$                  SU(2) gauge coupling           $\mu_{\overline{\mathrm{MS}}}$ = $m_Z$    0.652
$g_3$                  SU(3) gauge coupling           $\mu_{\overline{\mathrm{MS}}}$ = $m_Z$    1.221
$\theta_{\mathrm{QCD}}$  QCD vacuum angle                                                       ~0
$\mu$                  Higgs quadratic coupling                                                 Unknown
$\lambda$              Higgs self-coupling strength                                             Unknown



Technically, quantum field theory provides the mathematical framework for the standard model, in which a
Lagrangian controls the dynamics and kinematics of the theory. Each kind of particle is described in terms of a
dynamical field that pervades space-time. The construction of the standard model follows the modern
method of constructing most field theories: first postulating a set of symmetries of the system, and then writing
down the most general renormalizable Lagrangian from its particle (field) content that observes these symmetries.

The global Poincaré symmetry is postulated for all relativistic quantum field theories. It consists of the familiar
translational symmetry, rotational symmetry and the inertial reference frame invariance central to the theory of
special relativity. The local SU(3)×SU(2)×U(1) gauge symmetry is an internal symmetry that essentially defines the
standard model. Roughly, the three factors of the gauge symmetry give rise to the three fundamental interactions.
The fields fall into different representations of the various symmetry groups of the Standard Model (see table). Upon
writing the most general Lagrangian, one finds that the dynamics depend on 19 parameters, whose numerical values
are established by experiment. The parameters are summarized in the table "Parameters of the Standard Model".

The QCD sector 

The QCD sector defines the interactions between quarks and gluons, with SU(3) symmetry, generated by $T^a$. Since
leptons do not interact with gluons, they are not affected by this sector.

$$\mathcal{L}_{\mathrm{QCD}} = i\bar{U} \left( \partial_\mu - i g_s G_\mu^a T^a \right) \gamma^\mu U + i\bar{D} \left( \partial_\mu - i g_s G_\mu^a T^a \right) \gamma^\mu D.$$

Here $G_\mu^a$ is the SU(3) gauge field containing the gluons, $\gamma^\mu$ are the Dirac matrices, $U$ and $D$ are the Dirac
spinors of the up-type and down-type quarks, and $g_s$ is the strong coupling constant.

The electroweak sector 

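The electroweak sector is a Yang-Mills gauge theory with the symmetry group U(1)×SU(2)$_L$,

$$\mathcal{L}_{\mathrm{EW}} = \sum_\psi \bar\psi \gamma^\mu \left( i\partial_\mu - g' \tfrac{1}{2} Y_W B_\mu - g \tfrac{1}{2} \vec\tau_L \vec{W}_\mu \right) \psi,$$

where $B_\mu$ is the U(1) gauge field; $Y_W$ is the weak hypercharge, the generator of the U(1) group; $\vec{W}_\mu$ is the
three-component SU(2) gauge field; and $\vec\tau_L$ are the Pauli matrices, the infinitesimal generators of the SU(2) group. The
subscript L indicates that they only act on left fermions; $g'$ and $g$ are coupling constants.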

The Higgs sector 

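In the Standard Model, the Higgs field is a complex spinor of the group SU(2)$_L$:

$$\varphi = \frac{1}{\sqrt{2}} \begin{pmatrix} \varphi^+ \\ \varphi^0 \end{pmatrix},$$

where the indices + and 0 indicate the electric charge ($Q$) of the components. The weak hypercharge ($Y_W$) of both
components is 1.

Before symmetry breaking, the Higgs Lagrangian is:

$$\mathcal{L}_H = \varphi^\dagger \left( \partial^\mu - \frac{i}{2} \left( g' Y_W B^\mu + g \vec\tau \vec{W}^\mu \right) \right) \left( \partial_\mu + \frac{i}{2} \left( g' Y_W B_\mu + g \vec\tau \vec{W}_\mu \right) \right) \varphi - \frac{\lambda^2}{4} \left( \varphi^\dagger \varphi - v^2 \right)^2,$$

which can also be written as:

$$\mathcal{L}_H = \left| \left( \partial_\mu + \frac{i}{2} \left( g' Y_W B_\mu + g \vec\tau \vec{W}_\mu \right) \right) \varphi \right|^2 - \frac{\lambda^2}{4} \left( \varphi^\dagger \varphi - v^2 \right)^2.$$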






Additional symmetries of the Standard Model 

From the theoretical point of view, the Standard Model exhibits four additional global symmetries, not postulated at
the outset of its construction, collectively denoted accidental symmetries, which are continuous U(1) global
symmetries. The transformations leaving the Lagrangian invariant are:

$$\psi_q(x) \to e^{i\alpha/3} \psi_q$$

$$E_L \to e^{i\beta} E_L \quad\text{and}\quad (e_R)^c \to e^{i\beta} (e_R)^c$$

$$M_L \to e^{i\beta} M_L \quad\text{and}\quad (\mu_R)^c \to e^{i\beta} (\mu_R)^c$$

$$T_L \to e^{i\beta} T_L \quad\text{and}\quad (\tau_R)^c \to e^{i\beta} (\tau_R)^c.$$

The first transformation rule is shorthand meaning that all quark fields for all generations must be rotated by an
identical phase simultaneously. The fields $M_L$, $T_L$ and $(\mu_R)^c$, $(\tau_R)^c$ are the 2nd (muon) and 3rd (tau)
generation analogs of the $E_L$ and $(e_R)^c$ fields.

By Noether's theorem, each symmetry above has an associated conservation law: the conservation of baryon number, 
electron number, muon number, and tau number. Each quark is assigned a baryon number of 1/3, while each 
antiquark is assigned a baryon number of -1/3. Conservation of baryon number implies that the number of quarks 
minus the number of antiquarks is a constant. Within experimental limits, no violation of this conservation law has 
been found. 
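The bookkeeping behind baryon number can be sketched in a few lines: each quark contributes +1/3 and each antiquark -1/3, so three-quark states carry baryon number 1 and quark-antiquark states carry 0. The quark content assigned to the proton, neutron and pion below is standard, but the string labels and the helper function are hypothetical choices for this illustration.

```python
from fractions import Fraction as F

QUARK_FLAVORS = {"u", "d", "c", "s", "t", "b"}


def baryon_number(constituents):
    """Sum +1/3 per quark and -1/3 per antiquark (antiquarks written 'anti-u')."""
    total = F(0)
    for name in constituents:
        if name in QUARK_FLAVORS:
            total += F(1, 3)
        elif name.startswith("anti-") and name[5:] in QUARK_FLAVORS:
            total -= F(1, 3)
        else:
            raise ValueError(f"unknown constituent: {name}")
    return total


proton = ["u", "u", "d"]
neutron = ["u", "d", "d"]
pion_plus = ["u", "anti-d"]

print(baryon_number(proton))     # a baryon
print(baryon_number(neutron))    # a baryon
print(baryon_number(pion_plus))  # a meson
```

In neutron beta decay, for instance, the neutron and the proton both carry baryon number 1 while the emitted leptons carry none, so the total is unchanged.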

Similarly, each electron and its associated neutrino is assigned an electron number of +1, while the antielectron and 
the associated antineutrino carry -1 electron number. Similarly, the muons and their neutrinos are assigned a muon 
number of +1 and the tau leptons are assigned a tau lepton number of +1. The Standard Model predicts that each of 
these three numbers should be conserved separately in a manner similar to the way baryon number is conserved. 
These numbers are collectively known as lepton family numbers (LF). Symmetry works differently for quarks than 
for leptons, mainly because the Standard Model predicts that neutrinos are massless. However, it was recently found 
that neutrinos have small masses and oscillate between flavors, signaling that the conservation of lepton family 
number is violated. 

In addition to the accidental (but exact) symmetries described above, the Standard Model exhibits several
approximate symmetries. These are the "SU(2) custodial symmetry" and the "SU(2) or SU(3) quark flavor
symmetry."



Symmetries of the Standard Model and Associated Conservation Laws

Symmetry         Lie Group               Symmetry Type                Conservation Law
Poincaré         Translations×SO(3,1)    Global symmetry              Energy, Momentum, Angular momentum
Gauge            SU(3)×SU(2)×U(1)        Local symmetry               Color charge, Weak isospin, Electric charge, Weak hypercharge
Baryon phase     U(1)                    Accidental Global symmetry   Baryon number
Electron phase   U(1)                    Accidental Global symmetry   Electron number
Muon phase       U(1)                    Accidental Global symmetry   Muon number
Tau phase        U(1)                    Accidental Global symmetry   Tau number






Field content of the Standard Model

Field (1st generation)       Symbol       Spin   Gauge group representation   Baryon number   Electron number
Left-handed quark            $Q_L$        1/2    ($3$, $2$, +1/3)             1/3             0
Left-handed up antiquark     $(u_R)^c$    1/2    ($\bar{3}$, $1$, -4/3)       -1/3            0
Left-handed down antiquark   $(d_R)^c$    1/2    ($\bar{3}$, $1$, +2/3)       -1/3            0
Left-handed lepton           $E_L$        1/2    ($1$, $2$, -1)               0               1
Left-handed antielectron     $(e_R)^c$    1/2    ($1$, $1$, +2)               0               -1
Hypercharge gauge field      $B_\mu$      1      ($1$, $1$, 0)                0               0
Isospin gauge field          $W_\mu$      1      ($1$, $3$, 0)                0               0
Gluon field                  $G_\mu$      1      ($8$, $1$, 0)                0               0
Higgs field                  $H$          0      ($1$, $2$, +1)               0               0


List of standard model fermions 

This table is based in part on data gathered by the Particle Data Group.[19]

Left-handed fermions in the Standard Model

Generation 1

Fermion (left-handed)   Symbol          Electric charge   Weak isospin   Weak hypercharge   Color charge *   Mass **
Electron                $e^-$           -1                -1/2           -1                 $1$              511 keV
Positron                $e^+$           +1                0              +2                 $1$              511 keV
Electron neutrino       $\nu_e$         0                 +1/2           -1                 $1$              < 2 eV ****
Electron antineutrino   $\bar\nu_e$     0                 0              0                  $1$              < 2 eV ****
Up quark                $u$             +2/3              +1/2           +1/3               $3$              ~ 3 MeV ***
Up antiquark            $\bar{u}$       -2/3              0              -4/3               $\bar{3}$        ~ 3 MeV ***
Down quark              $d$             -1/3              -1/2           +1/3               $3$              ~ 6 MeV ***
Down antiquark          $\bar{d}$       +1/3              0              +2/3               $\bar{3}$        ~ 6 MeV ***




Generation 2

Fermion (left-handed)   Symbol          Electric charge   Weak isospin   Weak hypercharge   Color charge *   Mass **
Muon                    $\mu^-$         -1                -1/2           -1                 $1$              106 MeV
Antimuon                $\mu^+$         +1                0              +2                 $1$              106 MeV
Muon neutrino           $\nu_\mu$       0                 +1/2           -1                 $1$              < 2 eV ****
Muon antineutrino       $\bar\nu_\mu$   0                 0              0                  $1$              < 2 eV ****
Charm quark             $c$             +2/3              +1/2           +1/3               $3$              ~ 1.337 GeV
Charm antiquark         $\bar{c}$       -2/3              0              -4/3               $\bar{3}$        ~ 1.3 GeV
Strange quark           $s$             -1/3              -1/2           +1/3               $3$              ~ 100 MeV
Strange antiquark       $\bar{s}$       +1/3              0              +2/3               $\bar{3}$        ~ 100 MeV








Generation 3

Fermion (left-handed)   Symbol           Electric charge   Weak isospin   Weak hypercharge   Color charge *   Mass **
Tau                     $\tau^-$         -1                -1/2           -1                 $1$              1.78 GeV
Antitau                 $\tau^+$         +1                0              +2                 $1$              1.78 GeV
Tau neutrino            $\nu_\tau$       0                 +1/2           -1                 $1$              < 2 eV ****
Tau antineutrino        $\bar\nu_\tau$   0                 0              0                  $1$              < 2 eV ****
Top quark               $t$              +2/3              +1/2           +1/3               $3$              171 GeV
Top antiquark           $\bar{t}$        -2/3              0              -4/3               $\bar{3}$        171 GeV
Bottom quark            $b$              -1/3              -1/2           +1/3               $3$              ~ 4.2 GeV
Bottom antiquark        $\bar{b}$        +1/3              0              +2/3               $\bar{3}$        ~ 4.2 GeV


Notes:

• * These are not ordinary abelian charges, which can be added together, but are labels of group representations of Lie groups.

• ** Mass is really a coupling between a left-handed fermion and a right-handed fermion. For example, the mass of an electron is really a
coupling between a left-handed electron and a right-handed electron, which is the antiparticle of a left-handed positron. Also, neutrinos show
large mixings in their mass coupling, so it is not accurate to talk about neutrino masses in the flavor basis or to suggest a left-handed electron
antineutrino.

• *** The experimentally measured quantities are the masses of hadrons and various cross-sections. Since quarks cannot be isolated
because of QCD confinement, the value quoted here is meant to be the quark mass at a renormalization scale equal to the QCD scale.

• **** The Standard Model assumes that neutrinos are massless. However, several contemporary experiments show that neutrinos oscillate
between their flavour states, which could not happen if all were massless. It is straightforward to extend the model to fit these data, but there
are many possibilities, so the mass eigenstates are still open. See Neutrino#Mass.
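
The electric charge, weak isospin and weak hypercharge columns of the table are not independent. A minimal sketch, assuming the normalization Q = T3 + Y/2 (which the tabulated values appear to follow; note [18] mentions the alternative Q = I + Y), checks the first-generation rows. The function and variable names are illustrative only.

```python
from fractions import Fraction as F

# (name, electric charge Q, weak isospin T3, weak hypercharge Y),
# copied from the Generation 1 rows of the table above.
FIRST_GENERATION = [
    ("electron",              F(-1),    F(-1, 2), F(-1)),
    ("positron",              F(1),     F(0),     F(2)),
    ("electron neutrino",     F(0),     F(1, 2),  F(-1)),
    ("electron antineutrino", F(0),     F(0),     F(0)),
    ("up quark",              F(2, 3),  F(1, 2),  F(1, 3)),
    ("up antiquark",          F(-2, 3), F(0),     F(-4, 3)),
    ("down quark",            F(-1, 3), F(-1, 2), F(1, 3)),
    ("down antiquark",        F(1, 3),  F(0),     F(2, 3)),
]


def charge_from_isospin_and_hypercharge(t3, y):
    """Assumed normalization: Q = T3 + Y/2."""
    return t3 + y / 2


def inconsistent_rows(rows):
    """Return the names of rows whose charge disagrees with T3 + Y/2."""
    return [
        name
        for name, q, t3, y in rows
        if charge_from_isospin_and_hypercharge(t3, y) != q
    ]


if __name__ == "__main__":
    print(inconsistent_rows(FIRST_GENERATION))
```

Because the second and third generations repeat the same charge assignments, the same check applies to them unchanged.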



Tests and predictions 

The Standard Model (SM) predicted the existence of the W and Z bosons, gluon, and the top and charm quarks before these
particles were observed. Their predicted properties were experimentally confirmed with good precision. To give an idea of the
success of the SM, the following table compares the measured masses of the W and Z bosons with the masses
predicted by the SM:



[Figure: Log plot of masses in the Standard Model.]



Quantity          Measured (GeV)      SM prediction (GeV)
Mass of W boson   80.398 ± 0.025      80.390 ± 0.018
Mass of Z boson   91.1876 ± 0.0021    91.1874 ± 0.0021
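
One way to quantify the agreement in this table is the "pull", the difference between measurement and prediction divided by the combined uncertainty. The sketch below assumes the two uncertainties in each row are independent and Gaussian, which the text does not state; the names are illustrative.

```python
import math

# Values in GeV, copied from the table above: (central value, uncertainty).
MASSES = {
    "W": {"measured": (80.398, 0.025), "predicted": (80.390, 0.018)},
    "Z": {"measured": (91.1876, 0.0021), "predicted": (91.1874, 0.0021)},
}


def pull(measured, predicted):
    """(measured - predicted) in units of the combined uncertainty.

    Assumption: the two uncertainties are independent, so they are
    added in quadrature.
    """
    (m, sigma_m), (p, sigma_p) = measured, predicted
    return (m - p) / math.hypot(sigma_m, sigma_p)


PULLS = {
    boson: pull(row["measured"], row["predicted"])
    for boson, row in MASSES.items()
}

if __name__ == "__main__":
    for boson, value in PULLS.items():
        print(f"{boson}: pull = {value:.2f} sigma")
```

Both pulls come out well below one standard deviation, which is what "confirmed with good precision" means here in quantitative terms.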



The SM also makes several predictions about the decay of Z bosons, which have been experimentally confirmed by 
the Large Electron-Positron Collider at CERN. 






Challenges to the standard model 

There is some experimental evidence consistent with neutrinos having mass, which the Standard Model does not 
allow. To accommodate such findings, the Standard Model can be modified by adding a non-renormalizable 
interaction of lepton fields with the square of the Higgs field. This is natural in certain grand unified theories, and if 
new physics appears at about 10^16 GeV, the neutrino masses are of the right order of magnitude.
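
The order-of-magnitude claim can be checked with a short estimate. An interaction of two lepton fields with the square of the Higgs field, suppressed by a new-physics scale Λ, gives a mass of roughly v²/Λ, where v is the Higgs vacuum expectation value. The value v ≈ 246 GeV and the order-one coupling are assumptions not taken from the text.

```python
HIGGS_VEV_GEV = 246.0          # assumption: standard value, not quoted in the text
NEW_PHYSICS_SCALE_GEV = 1e16   # scale quoted in the text
EV_PER_GEV = 1e9


def neutrino_mass_ev(vev_gev, scale_gev, coupling=1.0):
    """Rough estimate m ~ coupling * v^2 / Lambda, returned in eV.

    The dimensionless coupling is set to 1 purely for illustration.
    """
    return coupling * vev_gev ** 2 / scale_gev * EV_PER_GEV


ESTIMATE_EV = neutrino_mass_ev(HIGGS_VEV_GEV, NEW_PHYSICS_SCALE_GEV)

if __name__ == "__main__":
    print(f"m_nu ~ {ESTIMATE_EV:.1e} eV")
```

Under these assumptions the estimate is a few thousandths of an electronvolt, comfortably below the < 2 eV bound quoted in the fermion table.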

Currently, there is one elementary particle predicted by the Standard Model that has yet to be observed: the Higgs 
boson. A major reason for building the Large Hadron Collider is that the high energies of which it is capable are 
expected to make the Higgs observable. However, as of August 2008, there is only indirect empirical evidence for 
the existence of the Higgs boson, so that its discovery cannot be claimed. Moreover, there are serious theoretical 
reasons for supposing that elementary scalar Higgs particles cannot exist (see Quantum triviality). 

A fair amount of theoretical and experimental research has attempted to extend the Standard Model into a Unified 
Field Theory or a Theory of everything, a complete theory explaining all physical phenomena including constants. 
Inadequacies of the Standard Model that motivate such research include: 

• It does not attempt to explain gravitation, and unlike for the strong and electroweak interactions of the Standard 
Model, there is no known way of describing general relativity, the canonical theory of gravitation, consistently in 
terms of quantum field theory. The reason for this is among other things that quantum field theories of gravity 
generally break down before reaching the Planck scale. As a consequence, we have no reliable theory for the very 
early universe; 

• It seems rather ad hoc and inelegant, requiring 19 numerical constants whose values are unrelated and arbitrary.
Although the Standard Model can be extended to accommodate neutrino masses, the specifics of neutrino
mass are still unclear. It is believed that explaining neutrino mass will require an additional 7 or 8 constants,
which are also arbitrary parameters;

• The Higgs mechanism gives rise to the hierarchy problem if any new physics (such as quantum gravity) is present 
at high energy scales. In order for the weak scale to be much smaller than the Planck scale, severe fine tuning of 
Standard Model parameters is required; 

• It should be modified so as to be consistent with the emerging "standard model of cosmology." In particular, the 
Standard Model cannot explain the observed amount of cold dark matter (CDM) and gives contributions to dark 
energy which are far too large. It is also difficult to accommodate the observed predominance of matter over 
antimatter (matter/antimatter asymmetry). The isotropy and homogeneity of the visible universe over large
distances seem to require a mechanism like cosmic inflation, which would also constitute an extension of the
Standard Model. 
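
The size of the fine tuning mentioned in the hierarchy-problem item above can be illustrated with a rough ratio. The sketch assumes that corrections to the squared weak scale grow like the square of the cutoff, and uses v ≈ 246 GeV and a Planck mass of about 1.22 × 10^19 GeV; neither number appears in the text.

```python
WEAK_SCALE_GEV = 246.0       # assumption: Higgs vacuum expectation value
PLANCK_SCALE_GEV = 1.22e19   # assumption: Planck mass


def tuning_ratio(low_scale_gev, high_scale_gev):
    """Ratio of squared scales, a crude measure of the required cancellation."""
    return (low_scale_gev / high_scale_gev) ** 2


RATIO = tuning_ratio(WEAK_SCALE_GEV, PLANCK_SCALE_GEV)

if __name__ == "__main__":
    print(f"(weak scale / Planck scale)^2 ~ {RATIO:.1e}")
```

A ratio of this size means the relevant parameters would have to cancel to roughly one part in 10^33 or better, which is the sense in which the tuning is called severe.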

Currently no proposed Theory of everything has been conclusively verified. 

Notes and references 

Notes 

[1] S.L. Glashow (1961). "Partial-symmetries of weak interactions". Nuclear Physics 22: 579-588. doi:10.1016/0029-5582(61)90469-2.
[2] S. Weinberg (1967). "A Model of Leptons". Physical Review Letters 19: 1264-1266. doi:10.1103/PhysRevLett.19.1264.
[3] A. Salam (1968). N. Svartholm. ed. Elementary Particle Physics: Relativistic Groups and Analyticity. Eighth Nobel Symposium. Stockholm:
Almqvist and Wiksell. pp. 367.
[4] F. Englert, R. Brout (1964). "Broken Symmetry and the Mass of Gauge Vector Mesons". Physical Review Letters 13: 321-323.
doi:10.1103/PhysRevLett.13.321.
[5] P.W. Higgs (1964). "Broken Symmetries and the Masses of Gauge Bosons". Physical Review Letters 13: 508-509.
doi:10.1103/PhysRevLett.13.508.
[6] G.S. Guralnik, C.R. Hagen, T.W.B. Kibble (1964). "Global Conservation Laws and Massless Particles". Physical Review Letters 13:
585-587. doi:10.1103/PhysRevLett.13.585.
[7] F.J. Hasert et al. (1973). "Search for elastic muon-neutrino electron scattering". Physics Letters B 46: 121.
doi:10.1016/0370-2693(73)90494-2.
[8] F.J. Hasert et al. (1973). "Observation of neutrino-like interactions without muon or electron in the Gargamelle neutrino experiment". Physics
Letters B 46: 138. doi:10.1016/0370-2693(73)90499-1.
[9] F.J. Hasert et al. (1974). "Observation of neutrino-like interactions without muon or electron in the Gargamelle neutrino experiment". Nuclear
Physics B 73: 1. doi:10.1016/0550-3213(74)90038-8.
[10] D. Haidt (4 October 2004). "The discovery of the weak neutral currents" (http://cerncourier.com/cws/article/cern/29168). CERN
Courier. Retrieved 2008-05-08.
[11] "Details can be worked out if the situation is simple enough for us to make an approximation, which is almost never, but often we can
understand more or less what is happening." from The Feynman Lectures on Physics, Vol 1. pp. 2-7
[12] Technically, there are nine such color-anticolor combinations. However, there is one color-symmetric combination that can be constructed
out of a linear superposition of the nine combinations, reducing the count to eight.
[13] F. Englert, R. Brout (1964). "Broken Symmetry and the Mass of Gauge Vector Mesons". Physical Review Letters 13: 321-323.
doi:10.1103/PhysRevLett.13.321.
[14] P.W. Higgs (1964). "Broken Symmetries and the Masses of Gauge Bosons". Physical Review Letters 13: 508-509.
doi:10.1103/PhysRevLett.13.508.
[15] G.S. Guralnik, C.R. Hagen, T.W.B. Kibble (1964). "Global Conservation Laws and Massless Particles". Physical Review Letters 13:
585-587. doi:10.1103/PhysRevLett.13.585.
[16] G.S. Guralnik (2009). "The History of the Guralnik, Hagen and Kibble development of the Theory of Spontaneous Symmetry Breaking and
Gauge Particles". International Journal of Modern Physics A 24: 2601-2627. doi:10.1142/S0217751X09045431. arXiv:0907.3466.
[17] A. Cho (23 January 2008). "Higgs Hiding in Plain Sight?" (http://sciencenow.sciencemag.org/cgi/content/full/2008/123/3).
ScienceNOW. Retrieved 2008-05-08.
[18] The normalization Q = I + Y is sometimes used instead.
[19] W.-M. Yao et al. (Particle Data Group) (2006). "Review of Particle Physics: Quarks" (http://pdg.lbl.gov/2006/tables/qxxx.pdf). Journal
of Physics G 33: 1. doi:10.1088/0954-3899/33/1/001.
[20] W.-M. Yao et al. (Particle Data Group) (2006). "Review of Particle Physics: Neutrino mass, mixing, and flavor change"
(http://pdg.lbl.gov/2007/reviews/numixrpp.pdf). Journal of Physics G 33: 1.
[21] http://press.web.cern.ch/press/PressReleases/Releases2010/PR08.10E.html

References 

Further reading 

• R. Oerter (2006). The Theory of Almost Everything: The Standard Model, the Unsung Triumph of Modern 
Physics. Plume. 

• B.A. Schumm (2004). Deep Down Things: The Breathtaking Beauty of Particle Physics. Johns Hopkins 
University Press. ISBN 0-8018-7971-X. 

• V. Stenger (2000). Timeless Reality. Prometheus Books. See chapters 9-12 in particular. 

Introductory textbooks 

• I. Aitchison, A. Hey (2003). Gauge Theories in Particle Physics: A Practical Introduction. Institute of Physics.
ISBN 9780585445502.

• W. Greiner, B. Müller (2000). Gauge Theory of Weak Interactions. Springer. ISBN 3-540-67672-4.

• G.D. Coughlan, J.E. Dodd, B.M. Gripaios (2006). The Ideas of Particle Physics: An Introduction for Scientists.
Cambridge University Press.

• D.J. Griffiths (1987). Introduction to Elementary Particles. John Wiley & Sons. ISBN 0-471-60386-4.

• G.L. Kane (1987). Modern Elementary Particle Physics. Perseus Books. ISBN 0-201-11749-5.

Advanced textbooks 

• T.P. Cheng, L.F. Li (2006). Gauge theory of elementary particle physics. Oxford University Press. 
ISBN 0-19-851961-3. Highlights the gauge theory aspects of the Standard Model. 

• J.F. Donoghue, E. Golowich, B.R. Holstein (1994). Dynamics of the Standard Model. Cambridge University 
Press. ISBN 978-0521476522. Highlights dynamical and phenomenological aspects of the Standard Model. 

• L. O'Raifeartaigh (1988). Group structure of gauge theories. Cambridge University Press. ISBN 0-521-34785-8. 
Highlights group-theoretical aspects of the Standard Model. 

Journal articles 






• E.S. Abers, B.W. Lee (1973). "Gauge theories". Physics Reports 9: 1-141. doi:10.1016/0370-1573(73)90027-6.

• Y. Hayato et al. (1999). "Search for Proton Decay through p → ν̄K⁺ in a Large Water Cherenkov Detector".
Physical Review Letters 83: 1529. doi:10.1103/PhysRevLett.83.1529.

• S.F. Novaes (2000). "Standard Model: An Introduction". arXiv:hep-ph/0001283 [hep-ph].

• D.P. Roy (1999). "Basic Constituents of Matter and their Interactions - A Progress Report".
arXiv:hep-ph/9912523 [hep-ph].

• F. Wilczek (2004). "The Universe Is A Strange Place". arXiv:astro-ph/0401347 [astro-ph]. 

External links 

• "Standard Model - explanation for beginners" (http://cms.web.cern.ch/cms/Physics/StandardPackage/index.html), LHC.

• "Standard Model may be found incomplete" (http://www.newscientist.com/news/news.jsp?id=ns9999404), New Scientist.

• "Observation of the Top Quark" (http://www-cdf.fnal.gov/top_status/top.html) at Fermilab.

• "The Standard Model Lagrangian" (http://cosmicvariance.com/2006/11/23/thanksgiving). After electroweak
symmetry breaking, with no explicit Higgs boson.

• "Standard Model Lagrangian" (http://nuclear.ucdavis.edu/~tgutierr/files/stmL1.html) with explicit Higgs
terms. PDF, PostScript, and LaTeX versions.

• "The particle adventure" (http://particleadventure.org/). Web tutorial.

• Nobes, Matthew (2002) "Introduction to the Standard Model of Particle Physics" on Kuro5hin:
Part 1 (http://www.kuro5hin.org/story/2002/5/1/3712/31700),
Part 2 (http://www.kuro5hin.org/story/2002/5/14/19363/8142),
Part 3a (http://www.kuro5hin.org/story/2002/7/15/173318/784),
Part 3b (http://www.kuro5hin.org/story/2002/8/21/195035/576).






Topological quantum field theory 



A topological quantum field theory (or topological field theory or TQFT) is a quantum field theory which 
computes topological invariants. 

Although TQFTs were invented by physicists, they are also of mathematical interest, being related to, among other 
things, knot theory and the theory of four-manifolds in algebraic topology, and to the theory of moduli spaces in 
algebraic geometry. Donaldson, Jones, Witten, and Kontsevich have all won Fields Medals for work related to 
topological field theory. 

In condensed matter physics, topological quantum field theories are the low energy effective theories of 
topologically ordered states, such as fractional quantum Hall states, string-net condensed states, and other strongly 
correlated quantum liquid states. 



Overview

In a topological field theory, the correlation functions do not depend on the metric on spacetime. This means that the
theory is not sensitive to changes in the shape of spacetime; if the spacetime warps or contracts, the correlation
functions do not change. Consequently, they are topological invariants.

Topological field theories are not very interesting on the flat Minkowski spacetime used in particle physics.
Minkowski space can be contracted to a point, so a TQFT on Minkowski space computes only trivial topological
invariants. Consequently, TQFTs are usually studied on curved spacetimes, such as, for example, Riemann surfaces.
Most of the known topological field theories are defined on spacetimes of dimension less than five. It seems that a
few higher dimensional theories exist, but they are not very well understood.

Quantum gravity is believed to be background-independent (in some suitable sense), and TQFTs provide examples
of background independent quantum field theories. This has prompted ongoing theoretical investigation of this class
of models.

(Caveat: It is often said that TQFTs have only finitely many degrees of freedom. This is not a fundamental property.
It happens to be true in most of the examples that physicists and mathematicians study, but it is not necessary. A
topological sigma model with target infinite-dimensional projective space, if such a thing could be defined, would
have countably infinitely many degrees of freedom.)

Specific models

The known topological field theories fall into two general classes: Schwarz-type TQFTs and Witten-type TQFTs.
Witten TQFTs are also sometimes referred to as cohomological field theories.

Schwarz-type TQFTs

In Schwarz-type TQFTs, the correlation functions computed by the path integral are topological invariants because
the path integral measure and the quantum field observables are explicitly independent of the metric. For instance, in
the BF model, the spacetime is a two-dimensional manifold M, and the observables are constructed from a two-form F,
an auxiliary scalar B, and their derivatives. The action (which determines the path integral) is

$$ S = \int_M B \, F $$

The spacetime metric does not appear anywhere in this theory, so the theory is explicitly topologically invariant.
Another, more famous example is Chern-Simons theory, which can be used to compute knot invariants.

Witten-type TQFTs 

In Witten-type topological field theories, the topological invariance is more subtle. For example the Lagrangian for
the WZW model does depend explicitly on the metric, but one shows by calculation that the expectation value of the 
partition function and a special class of correlation functions are in fact diffeomorphism invariant. 

Mathematical formulations 

Atiyah-Segal axioms 

Atiyah suggested a set of axioms for topological quantum field theory which was inspired by Segal's proposed
axioms for conformal field theory (Atiyah 1988). These axioms have been relatively useful for mathematical
treatments of Schwarz-type TQFTs, although it isn't clear that they capture the whole structure of Witten-type TQFTs.
The basic idea is that a TQFT is a functor from a certain category of cobordisms to the category of vector spaces.

There are in fact two different sets of axioms which could reasonably be called the Atiyah axioms. These axioms
differ basically in whether they study a TQFT defined on a single fixed n-dimensional Riemannian /
Lorentzian spacetime M or a TQFT defined on all n-dimensional spacetimes at once.

[ed. What follows is still in rough draft form and should be regarded suspiciously.] 

The case of a fixed spacetime 

Let Bord_M be the category whose morphisms are n-dimensional submanifolds of M and whose objects are
connected components of the boundaries of such submanifolds. Regard two morphisms as equivalent if they are
homotopic via submanifolds of M, and so form the quotient category hBord_M: the objects in hBord_M are the
objects of Bord_M, and the morphisms of hBord_M are homotopy equivalence classes of morphisms in Bord_M.

A TQFT on M is a symmetric monoidal functor from hBord_M to the category of vector spaces.

Note that cobordisms can, if their boundaries match up, be sewn together to form a new bordism. This is the
composition law for morphisms in the cobordism category. Since functors are required to preserve composition, this
says that the linear map corresponding to a sewn together morphism is just the composition of the linear maps for
the pieces.

There is an equivalence of categories between the category of 2-dimensional topological quantum field theories and 
the category of commutative Frobenius algebras. 

All n-dimensional spacetimes at once 

To consider all spacetimes at once, it is necessary to replace hBord_M by a larger category. So let Bord_n be the
category of bordisms, i.e. the category whose morphisms are n-dimensional manifolds with boundary, and whose
objects are the connected components of the boundaries of n-dimensional manifolds. (Note that any
(n - 1)-dimensional manifold may appear as an object in Bord_n.) As above, regard two morphisms in Bord_n as
equivalent if they are homotopic, and form the quotient category hBord_n. Bord_n is a monoidal category under the
operation which takes two bordisms to the bordism made from their disjoint union. A TQFT on n-dimensional
manifolds is then a functor from hBord_n to the category of vector spaces, which takes disjoint unions of bordisms to
the tensor product [ed. unfinished]

[Figure: The pair of pants is a (1+1)-dimensional bordism, which corresponds to a product or coproduct in a
2-dimensional TQFT.]






For example, for (1+1)-dimensional bordisms (2-dimensional bordisms between 1-dimensional manifolds), the map
associated with a pair of pants gives a product or coproduct, depending on how the boundary components are
grouped (and it is commutative or cocommutative), while the map associated with a disk gives a counit (trace) or
unit (scalars), depending on the grouping of the boundary. Thus (1+1)-dimensional TQFTs correspond to Frobenius
algebras.
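
The correspondence can be made concrete with a very small example. The sketch below picks, purely for illustration, the group algebra of Z_2 with the counit that reads off the coefficient of the identity; this choice of algebra and every function name are assumptions, not taken from the text. The pair of pants becomes the product, the disk becomes the counit, and a closed surface of genus g is evaluated by applying the "handle" map g times to the unit and then the counit.

```python
# Basis of A = C[Z_2]: BASIS[0] = e (identity), BASIS[1] = x with x * x = e.
BASIS = [(1, 0), (0, 1)]
UNIT = (1, 0)


def multiply(a, b):
    """Pair-of-pants product: (a0 e + a1 x)(b0 e + b1 x) in C[Z_2]."""
    return (a[0] * b[0] + a[1] * b[1], a[0] * b[1] + a[1] * b[0])


def counit(a):
    """Disk (trace) map: the coefficient of the identity element."""
    return a[0]


def pairing_matrix():
    """Frobenius pairing eta_ij = counit(e_i * e_j)."""
    return [[counit(multiply(ei, ej)) for ej in BASIS] for ei in BASIS]


def handle_element():
    """Element assigned to a torus with one boundary circle.

    Valid as written only because the pairing is the identity matrix in
    this basis, so the copairing is the sum over i of e_i tensor e_i.
    """
    assert pairing_matrix() == [[1, 0], [0, 1]]
    total = (0, 0)
    for ei in BASIS:
        square = multiply(ei, ei)
        total = (total[0] + square[0], total[1] + square[1])
    return total


def partition_function(genus):
    """Number assigned to a closed surface of the given genus."""
    state = UNIT
    for _ in range(genus):
        state = multiply(state, handle_element())
    return counit(state)


if __name__ == "__main__":
    print([partition_function(g) for g in range(4)])
```

For this algebra the torus is assigned the dimension of the algebra, as expected for any 2-dimensional TQFT, since the torus computes the trace of the identity map on the vector space of the circle.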

Generalizations 

For some applications, it is convenient to demand extra topological structure on the morphisms, such as a choice of 
orientation. 

References 

• Atiyah, Michael (1988), "Topological quantum field theories" [1], Publications Mathématiques de l'IHÉS 68 (68):
175-186, doi:10.1007/BF02698547, MR1001453

• Lurie, Jacob, On the Classification of Topological Field Theories [2]

• Witten, Edward (1988), "Topological quantum field theory" [3], Communications in Mathematical Physics 117
(3): 353-386, doi:10.1007/BF01223371, MR953828

Schwarz's original paper introducing the ideas of TQFTs, in which he produces a Ray-Singer invariant from a QFT
functional:

• Schwarz, Albert (1979), "The partition function of a degenerate functional" [4], Communications in Mathematical
Physics 67 (1): 1-16, doi:10.1007/BF01223197

References

[1] http://www.numdam.org/item?id=PMIHES_1988_68_175_0

[2] http://www-math.mit.edu/~lurie/papers/cobordism.pdf

[3] http://projecteuclid.org/euclid.cmp/1104161738

[4] http://www.springerlink.com/content/n5w785042u121628/






Quantum Chromodynamics 

In theoretical physics, quantum chromodynamics (QCD) is a theory of the strong interaction (color force), a 
fundamental force describing the interactions of the quarks and gluons making up hadrons (such as the proton, 
neutron or pion). It is the study of the SU(3) Yang-Mills theory of color-charged fermions (the quarks). QCD is a 
quantum field theory of a special kind called a non-abelian gauge theory. It is an important part of the Standard 
Model of particle physics. A huge body of experimental evidence for QCD has been gathered over the years. 

QCD enjoys two peculiar properties: 

• Confinement, which means that the force between quarks does not diminish as they are separated. Because of 
this, it would take an infinite amount of energy to separate two quarks; they are forever bound into hadrons such 
as the proton and the neutron. Although analytically unproven, confinement is widely believed to be true because 
it explains the consistent failure of free quark searches, and it is easy to demonstrate in lattice QCD. 

• Asymptotic freedom, which means that in very high-energy reactions, quarks and gluons interact very weakly. 
This prediction of QCD was first discovered in the early 1970s by David Politzer and by Frank Wilczek and 
David Gross. For this work they were awarded the 2004 Nobel Prize in Physics. 

There is no known phase-transition line separating these two properties; confinement is dominant in low-energy 
scales but, as energy increases, asymptotic freedom becomes dominant. 
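
Asymptotic freedom can be illustrated with the one-loop running of the strong coupling. The sketch below is only an approximation: it uses the one-loop formula, a fixed number of five quark flavors with no threshold matching, and a reference value of about 0.118 at the Z mass, none of which is taken from the text.

```python
import math

ALPHA_S_AT_MZ = 0.118   # assumption: approximate reference value
M_Z_GEV = 91.1876       # Z boson mass in GeV


def beta0(n_flavors):
    """One-loop coefficient b0 = (33 - 2 n_f) / (12 pi)."""
    return (33 - 2 * n_flavors) / (12 * math.pi)


def alpha_s(q_gev, n_flavors=5):
    """One-loop running coupling at momentum scale q_gev.

    Assumption: n_flavors is held fixed; flavor thresholds are ignored.
    """
    log_ratio = math.log(q_gev ** 2 / M_Z_GEV ** 2)
    return ALPHA_S_AT_MZ / (1 + beta0(n_flavors) * ALPHA_S_AT_MZ * log_ratio)


if __name__ == "__main__":
    for q in (10.0, M_Z_GEV, 1000.0):
        print(f"alpha_s({q:g} GeV) = {alpha_s(q):.3f}")
```

The coupling shrinks as the energy grows, which is the weak high-energy interaction described above. The same formula grows without bound toward low energies, where perturbation theory stops being reliable and confinement takes over.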

Terminology 

The word quark was coined by American physicist Murray Gell-Mann (b. 1929) in its present sense. It originally 
comes from the phrase "Three quarks for Muster Mark" in Finnegans Wake by James Joyce. On June 27, 1978, 
Gell-Mann wrote a private letter to the editor of the Oxford English Dictionary, in which he related that he had been 
influenced by Joyce's words: "The allusion to three quarks seemed perfect." (Originally, only three quarks had been 
discovered.) Gell-Mann, however, wanted to pronounce the word with (ô) not (ä), as Joyce seemed to indicate by
rhyming words in the vicinity such as Mark. Gell-Mann got around that "by supposing that one ingredient of the line
'Three quarks for Muster Mark' was a cry of 'Three quarts for Mister ...' heard in H.C. Earwicker's pub," a plausible
suggestion given the complex punning in Joyce's novel.

The three kinds of charge in QCD (as opposed to one in quantum electrodynamics or QED) are usually referred to as 
"color charge" by loose analogy to the three kinds of color (red, green and blue) perceived by humans. Other than 
this "clever" nomenclature, the quantum parameter "color" is completely unrelated to the everyday, familiar 
phenomenon of color. 

Since the theory of electric charge is dubbed "electrodynamics", the Greek word "chroma" χρώμα (meaning color)
is applied to the theory of color charge, "chromodynamics".

History 

With the invention of bubble chambers and spark chambers in the 1950s, experimental particle physics discovered a 
large and ever-growing number of particles called hadrons. It seemed that such a large number of particles could not 
all be fundamental. First, the particles were classified by charge and isospin by Eugene Wigner and Werner 
Heisenberg; then, in 1953, according to strangeness by Murray Gell-Mann and Kazuhiko Nishijima. To gain greater 
insight, the hadrons were sorted into groups having similar properties and masses using the eightfold way, invented 
in 1961 by Gell-Mann and Yuval Ne'eman. Gell-Mann and George Zweig, correcting an earlier approach of Shoichi 
Sakata, went on to propose in 1963 that the structure of the groups could be explained by the existence of three 
flavours of smaller particles inside the hadrons: the quarks. 






Perhaps the first remark that quarks should possess an additional quantum number was made as a short footnote in
the preprint of Boris Struminsky in connection with the Ω⁻ hyperon, which is composed of three strange quarks with
parallel spins (this situation was peculiar because, since quarks are fermions, such a combination is forbidden by the
Pauli exclusion principle):

Three identical quarks cannot form an antisymmetric S-state. In order to realize an antisymmetric orbital
S-state, it is necessary for the quark to have an additional quantum number.

- B. V. Struminsky, Magnetic moments of barions in the quark model, JINR-Preprint P-1939, Dubna,
submitted on January 7, 1965

Boris Struminsky was a PhD student of Nikolay Bogolyubov. The problem considered in this preprint was suggested
by Nikolay Bogolyubov, who advised Boris Struminsky in this research. In the beginning of 1965, Nikolay
Bogolyubov, Boris Struminsky and Albert Tavchelidze wrote a preprint with a more detailed discussion of the
additional quark quantum degree of freedom. This work was also presented by Albert Tavchelidze, without
obtaining the consent of his collaborators for doing so, at an international conference in Trieste (Italy) in May 1965.

A similar mysterious situation was with the Δ⁺⁺ baryon; in the quark model, it is composed of three up quarks with
parallel spins. In 1965, Moo-Young Han with Yoichiro Nambu and Oscar W. Greenberg independently resolved the
problem by proposing that quarks possess an additional SU(3) gauge degree of freedom, later called color charge.
Han and Nambu noted that quarks might interact via an octet of vector gauge bosons: the gluons.
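
The way color resolves the puzzle can be checked directly. For three identical quarks with parallel spins in an S-state, the space, spin and flavor parts of the wavefunction are all symmetric, so the required overall antisymmetry has to come from color. A minimal sketch, assuming the usual totally antisymmetric color-singlet combination (written here without its normalization factor, and with illustrative names), verifies that swapping two quarks flips the sign:

```python
from itertools import permutations

COLORS = ("red", "green", "blue")


def levi_civita(i, j, k):
    """Totally antisymmetric symbol for indices taken from {0, 1, 2}."""
    return (i - j) * (j - k) * (k - i) // 2


def color_singlet():
    """Unnormalized color state: amplitude for each color assignment."""
    return {perm: levi_civita(*perm) for perm in permutations(range(3))}


def swap_first_two(state):
    """Exchange the color labels of the first and second quark."""
    return {(j, i, k): amplitude for (i, j, k), amplitude in state.items()}


STATE = color_singlet()
SWAPPED = swap_first_two(STATE)

if __name__ == "__main__":
    for assignment, amplitude in STATE.items():
        labels = ", ".join(COLORS[index] for index in assignment)
        print(f"{labels}: {amplitude:+d}")
```

Every amplitude changes sign under the exchange, so the product of a symmetric space-spin-flavor part with this color part is antisymmetric, as the Pauli exclusion principle requires.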

Since free quark searches consistently failed to turn up any evidence for the new particles, and because an 
elementary particle back then was defined as a particle which could be separated and isolated, Gell-Mann often said 
that quarks were merely convenient mathematical constructs, not real particles. The meaning of this statement was 
usually clear in context: He meant quarks are confined, but he also was implying that the strong interactions could 
probably not be fully described by quantum field theory. 

Richard Feynman argued that high energy experiments showed quarks are real particles: he called them partons 
(since they were parts of hadrons). By particles, Feynman meant objects which travel along paths, elementary 
particles in a field theory. 

The difference between Feynman's and Gell-Mann's approaches reflected a deep split in the theoretical physics 
community. Feynman thought the quarks have a distribution of position or momentum, like any other particle, and 
he (correctly) believed that the diffusion of parton momentum explained diffractive scattering. Although Gell-Mann 
believed that certain quark charges could be localized, he was open to the possibility that the quarks themselves 
could not be localized because space and time break down. This was the more radical approach of S-matrix theory. 

James Bjorken proposed that pointlike partons would imply certain relations should hold in deep inelastic scattering 
of electrons and protons, which were spectacularly verified in experiments at SLAC in 1969. This led physicists to 
abandon the S-matrix approach for the strong interactions. 

The discovery of asymptotic freedom in the strong interactions by David Gross, David Politzer and Frank Wilczek 
allowed physicists to make precise predictions of the results of many high energy experiments using the quantum 
field theory technique of perturbation theory. Evidence of gluons was discovered in three jet events at PETRA in 
1979. These experiments became more and more precise, culminating in the verification of perturbative QCD at the 
level of a few percent at the LEP collider at CERN.

The other side of asymptotic freedom is confinement. Since the force between color charges does not decrease with 
distance, it is believed that quarks and gluons can never be liberated from hadrons. This aspect of the theory is 
verified within lattice QCD computations, but is not mathematically proven. One of the Millennium Prize Problems 
announced by the Clay Mathematics Institute requires a claimant to produce such a proof. Other aspects of 
non-perturbative QCD are the exploration of phases of quark matter, including the quark-gluon plasma. 

The relation between the short-distance particle limit and the confining long-distance limit is one of the topics 
recently explored using string theory, the modern form of S-matrix theory.






Theory 

Some definitions 

Every field theory of particle physics is based on certain symmetries of nature whose existence is deduced from 
observations. These can be 

• local symmetries, that is the symmetry acts independently at each point in space-time. Each such symmetry is the 
basis of a gauge theory and requires the introduction of its own gauge bosons. 

• global symmetries, which are symmetries whose operations must be simultaneously applied to all points of 
space-time. 

QCD is a gauge theory of the SU(3) gauge group obtained by taking the color charge to define a local symmetry. 

Since the strong interaction does not discriminate between different flavors of quark, QCD has approximate flavor 
symmetry, which is broken by the differing masses of the quarks. 

There are additional global symmetries whose definitions require the notion of chirality, the discrimination between
left-handed and right-handed. If the spin of a particle has a positive projection on its direction of motion then it is
called right-handed; otherwise, it is left-handed. Chirality and handedness (helicity) are not the same, but become
approximately equivalent at high energies.

• Chiral symmetries involve independent transformations of these two types of particle. 

• Vector symmetries (also called diagonal symmetries) mean the same transformation is applied on the two 
chiralities. 

• Axial symmetries are those in which one transformation is applied on left-handed particles and the inverse on the 
right-handed particles. 

Additional remarks: duality 

As mentioned, asymptotic freedom means that at large energy (which corresponds to short distances) there is
practically no interaction between the particles. This is in contrast (more precisely, one would say dual) to what one
is used to, since usually one connects the absence of interactions with large distances. However, as already
mentioned in the original paper of Franz Wegner,[9] a solid state theorist who in 1971 introduced simple gauge
invariant lattice models, the high-temperature behaviour of the original model, e.g. the strong decay of correlations
at large distances, corresponds to the low-temperature behaviour of the (usually ordered!) dual model, namely the
asymptotic decay of non-trivial correlations, e.g. short-range deviations from almost perfect arrangements, for short
distances. Here, in contrast to Wegner, we have only the dual model, which is the one described in this article.[10]

Symmetry groups 

The color group SU(3) corresponds to the local symmetry whose gauging gives rise to QCD. The electric charge
labels a representation of the local symmetry group U(1) which is gauged to give QED: this is an abelian group. If
one considers a version of QCD with N_f flavors of massless quarks, then there is a global (chiral) flavor symmetry
group SU_L(N_f) × SU_R(N_f) × U_B(1) × U_A(1). The chiral symmetry is spontaneously broken by the QCD
vacuum to the vector (L+R) SU_V(N_f) with the formation of a chiral condensate. The vector symmetry, U_B(1),
corresponds to the baryon number of quarks and is an exact symmetry. The axial symmetry U_A(1) is exact in the
classical theory, but broken in the quantum theory, an occurrence called an anomaly. Gluon field configurations
called instantons are closely related to this anomaly.

There are two different types of SU(3) symmetry: there is the symmetry that acts on the different colors of quarks, 
and this is an exact gauge symmetry mediated by the gluons, and there is also a flavor symmetry which rotates 
different flavors of quarks to each other, or flavor SU(3). Flavor SU(3) is an approximate symmetry of the vacuum of 
QCD, and is not a fundamental symmetry at all. It is an accidental consequence of the small mass of the three
lightest quarks.

In the QCD vacuum there are vacuum condensates of all the quarks whose mass is less than the QCD scale. This 
includes the up and down quarks, and to a lesser extent the strange quark, but not any of the others. The vacuum is 
symmetric under SU(2) isospin rotations of up and down, and to a lesser extent under rotations of up, down and 
strange, or full flavor group SU(3), and the observed particles make isospin and SU(3) multiplets. 

The approximate flavor symmetries do have associated gauge bosons, observed particles like the rho and the omega, 
but these particles are nothing like the gluons and they are not massless. They are emergent gauge bosons in an 
approximate string description of QCD. 

Lagrangian 

The dynamics of the quarks and gluons are controlled by the quantum chromodynamics Lagrangian. The gauge 
invariant QCD Lagrangian is 

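$$\mathcal{L}_{\mathrm{QCD}} = \bar{\psi}_i \left( i \gamma^\mu (D_\mu)_{ij} - m\,\delta_{ij} \right) \psi_j - \frac{1}{4} G^a_{\mu\nu} G^{\mu\nu}_a = \bar{\psi}_i \left( i \gamma^\mu \partial_\mu - m \right) \psi_i - g\, G^a_\mu \bar{\psi}_i \gamma^\mu T^a_{ij} \psi_j - \frac{1}{4} G^a_{\mu\nu} G^{\mu\nu}_a$$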

where $\psi_i(x)$ is the quark field, a dynamical function of space-time, in the fundamental representation of the SU(3) gauge group, indexed by i, j, ...; $G^a_\mu(x)$ are the gluon fields, also a dynamical function of space-time, in the adjoint representation of the SU(3) gauge group, indexed by a, b, ... . The $\gamma^\mu$ are Dirac matrices connecting the spinor representation to the vector representation of the Lorentz group; and $T^a_{ij}$ are the generators connecting the fundamental, antifundamental and adjoint representations of the SU(3) gauge group. The Gell-Mann matrices provide one such representation for the generators.

The symbol $G^a_{\mu\nu}$ represents the gauge invariant gluonic field strength tensor, analogous to the electromagnetic field strength tensor, $F^{\mu\nu}$, in electrodynamics. It is given by

$$G^a_{\mu\nu} = \partial_\mu G^a_\nu - \partial_\nu G^a_\mu - g f^{abc} G^b_\mu G^c_\nu ,$$

where $f^{abc}$ are the structure constants of SU(3). Note that the rules to move-up or pull-down the a, b, or c indexes are trivial, (+ ... +), so that $f^{abc} = f_{abc} = f^a_{\ bc}$, whereas for the $\mu$ or $\nu$ indexes one has the non-trivial relativistic rules, corresponding e.g. to the signature (+ − − −). Furthermore, for mathematicians, according to this formula the gluon colour field can be represented by a SU(3)-Lie algebra-valued "curvature" 2-form $G = d\tilde{G} - g\, \tilde{G} \wedge \tilde{G}$, where $\tilde{G}$ is a "vector potential" 1-form corresponding to G and $\wedge$ is the (antisymmetric) "wedge product" of this algebra, producing the "structure constants" $f^{abc}$. The Cartan-derivative of the field form (i.e. essentially the divergence of the field) would be zero in the absence of the "gluon terms", i.e. those $\sim g$, which represent the non-abelian character of the SU(3).

The constants m and g control the quark mass and coupling constants of the theory, subject to renormalization in the full quantum theory.
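
The statement that the Gell-Mann matrices furnish the generators, and that the structure constants $f^{abc}$ follow from them, can be checked numerically. The following is a minimal sketch, assuming the common normalisation $T^a = \lambda^a/2$ with $\mathrm{tr}(T^a T^b) = \delta^{ab}/2$ and the commutation relation $[T^a, T^b] = i f^{abc} T^c$; the function names are hypothetical and chosen only for this illustration.

```python
import numpy as np

def gell_mann():
    """Return the eight Gell-Mann matrices as an array of shape (8, 3, 3)."""
    lam = np.zeros((8, 3, 3), dtype=complex)
    lam[0] = [[0, 1, 0], [1, 0, 0], [0, 0, 0]]
    lam[1] = [[0, -1j, 0], [1j, 0, 0], [0, 0, 0]]
    lam[2] = [[1, 0, 0], [0, -1, 0], [0, 0, 0]]
    lam[3] = [[0, 0, 1], [0, 0, 0], [1, 0, 0]]
    lam[4] = [[0, 0, -1j], [0, 0, 0], [1j, 0, 0]]
    lam[5] = [[0, 0, 0], [0, 0, 1], [0, 1, 0]]
    lam[6] = [[0, 0, 0], [0, 0, -1j], [0, 1j, 0]]
    lam[7] = np.diag([1, 1, -2]) / np.sqrt(3)
    return lam

# Generators in the fundamental representation (assumed normalisation T = lambda / 2).
T = gell_mann() / 2

def structure_constants(T):
    """Extract f^{abc} from [T^a, T^b] = i f^{abc} T^c.

    Uses tr(T^c T^d) = delta_cd / 2, so f^{abc} = -2i tr([T^a, T^b] T^c).
    """
    n = len(T)
    f = np.zeros((n, n, n))
    for a in range(n):
        for b in range(n):
            comm = T[a] @ T[b] - T[b] @ T[a]
            for c in range(n):
                f[a, b, c] = (-2j * np.trace(comm @ T[c])).real
    return f

def algebra_closes(T, f):
    """Check that every commutator is reproduced by i f^{abc} T^c."""
    n = len(T)
    for a in range(n):
        for b in range(n):
            comm = T[a] @ T[b] - T[b] @ T[a]
            rhs = 1j * np.tensordot(f[a, b], T, axes=1)
            if not np.allclose(comm, rhs):
                return False
    return True

f = structure_constants(T)
```

With zero-based indices, `f[0, 1, 2]` corresponds to $f^{123}$. Because the index positions are raised and lowered trivially, as noted above, a single array `f` serves for all index placements.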

An important theoretical notion concerning the final term of the above Lagrangian is the Wilson loop variable. This loop variable plays a most important role in discretized forms of QCD (see lattice QCD), and more generally, it distinguishes confined and deconfined states of a gauge theory. It was introduced by the Nobel prize winner Kenneth G. Wilson and is treated in a separate article.



Fields 

Quarks are massive spin-1/2 fermions which carry a color charge whose gauging is the content of QCD. Quarks are
represented by Dirac fields in the fundamental representation 3 of the gauge group SU(3). They also carry electric 
charge (either -1/3 or 2/3) and participate in weak interactions as part of weak isospin doublets. They carry global 
quantum numbers including the baryon number, which is 1/3 for each quark, hypercharge and one of the flavor 
quantum numbers. 

Gluons are spin-1 bosons which also carry color charges, since they lie in the adjoint representation 8 of SU(3). They 
have no electric charge, do not participate in the weak interactions, and have no flavor. They lie in the singlet 
representation 1 of all these symmetry groups. 

Every quark has its own antiquark. The charge of each antiquark is exactly the opposite of the corresponding quark.

Dynamics

According to the rules of quantum field theory, and the associated Feynman diagrams, the above theory gives rise to 
three basic interactions: a quark may emit (or absorb) a gluon, a gluon may emit (or absorb) a gluon, and two gluons 
may directly interact. This contrasts with QED, in which only the first kind of interaction occurs, since photons have 
no charge. Diagrams involving Faddeev-Popov ghosts must be considered too. 

Area law and confinement 

Detailed computations with the above-mentioned Lagrangian[11] show that the effective potential between a quark and its anti-quark in a meson contains a term $\propto r$, which represents some kind of "stiffness" of the interaction between the particle and its anti-particle at large distances, similar to the entropic elasticity of a rubber band (see below). This leads to confinement[12] of the quarks to the interior of hadrons, i.e. mesons and nucleons, with typical radii $R_c$, corresponding to former "Bag models" of the hadrons.[13] The order of magnitude of the "bag radius" is 1 fm ($= 10^{-15}$ m). Moreover, the above-mentioned stiffness is quantitatively related to the so-called "area law" behaviour of the expectation value of the Wilson loop product $P_W$ of the ordered coupling constants around a closed loop W; i.e. $\langle P_W \rangle$ decays exponentially with the area enclosed by the loop. For this behaviour the non-abelian behaviour of the gauge group is essential.
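
The connection between the area law and the linear term in the potential can be sketched in a few lines. In the standard formulation the expectation value of the Wilson loop falls off exponentially with the enclosed area, so it is its logarithm that is proportional to the area. The symbols $\sigma$ (string tension) and $R$, $T$ (spatial and temporal extent of a rectangular loop) are introduced here for illustration and are not taken from the text.

```latex
% Assumed notation: \sigma = string tension, R x T = rectangular loop W.
\begin{align}
\langle P_W \rangle &\sim e^{-\sigma A(W)}, \qquad A(W) = R\,T , \\
V(R) &= -\lim_{T \to \infty} \frac{1}{T} \ln \langle P_W \rangle = \sigma R .
\end{align}
```

The potential thus grows linearly with the separation $R$, which is the "stiffness" term $\propto r$ described above.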

Methods 

Further analysis of the content of the theory is complicated. Various techniques have been developed to work with 
QCD. Some of them are discussed briefly below. 

Perturbative QCD 

This approach is based on asymptotic freedom, which allows perturbation theory to be used accurately in 
experiments performed at very high energies. Although limited in scope, this approach has resulted in the most 
precise tests of QCD to date. 
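
Asymptotic freedom can be made concrete with the one-loop formula for the running coupling. The sketch below assumes the standard one-loop expression with coefficient $b_0 = (33 - 2 n_f)/(12\pi)$; the reference value $\alpha_s \approx 0.118$ at about 91.19 GeV and the fixed number of flavors are illustrative inputs, not values taken from the text, and flavor thresholds and higher loops are ignored.

```python
import math

def alpha_s_one_loop(Q, alpha_ref=0.118, mu_ref=91.19, n_f=5):
    """One-loop running strong coupling at the scale Q (in GeV).

    alpha_ref is the coupling at the reference scale mu_ref (assumed
    illustrative inputs). n_f is the number of active quark flavors,
    held fixed here for simplicity.
    """
    b0 = (33 - 2 * n_f) / (12 * math.pi)
    return alpha_ref / (1 + alpha_ref * b0 * math.log(Q**2 / mu_ref**2))

# The coupling shrinks as the energy scale grows (asymptotic freedom).
for scale in (10.0, 91.19, 1000.0):
    print(f"Q = {scale:8.2f} GeV  ->  alpha_s = {alpha_s_one_loop(scale):.4f}")
```

The formula breaks down at low scales, where the denominator approaches zero and perturbation theory is no longer reliable; this is the regime in which the non-perturbative methods below are needed.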

Lattice QCD 

Among non-perturbative approaches to QCD, the most well established one is lattice QCD. This approach uses a 
discrete set of space-time points (called the lattice) to reduce the analytically intractable path integrals of the 
continuum theory to a very difficult numerical computation which is then carried out on supercomputers like the 
QCDOC which was constructed for precisely this purpose. While it is a slow and resource-intensive approach, it has 
wide applicability, giving insight into parts of the theory inaccessible by other means. However, the numerical sign 
problem makes it difficult to use lattice methods to study QCD at high density and low temperature (e.g. nuclear 
matter or the interior of neutron stars). 



1/N expansion 

A well-known approximation scheme, the 1/N expansion, starts from the premise that the number of colors is 
infinite, and makes a series of corrections to account for the fact that it is not. Until now it has been the source of 
qualitative insight rather than a method for quantitative predictions. Modern variants include the AdS/CFT approach. 

Effective theories 

For specific problems effective theories may be written down which give qualitatively correct results in certain 
limits. In the best of cases, these may then be obtained as systematic expansions in some parameter of the QCD 
Lagrangian. One such effective field theory is chiral perturbation theory or ChiPT, which is the QCD effective
theory at low energies. More precisely, it is a low energy expansion based on the spontaneous breaking of the chiral symmetry
of QCD. This symmetry is exact when the quark masses are equal to zero, and it remains a good approximate symmetry for the
u, d and s quarks, which have small masses. Depending on the number of quarks which are
treated as light, one uses either SU(2) ChiPT or SU(3) ChiPT. Other effective theories are heavy quark effective
theory (which expands around heavy quark mass near infinity), and soft-collinear effective theory (which expands 
around large ratios of energy scales). In addition to effective theories, models like the Nambu-Jona-Lasinio model 
and the chiral model are often used when discussing general features. 

QCD Sum Rules 

Based on an operator product expansion, one can derive sets of relations that connect different observables with each
other.

Experimental tests 

The notion of quark flavours was prompted by the necessity of explaining the properties of hadrons during the 
development of the quark model. The notion of colour was necessitated by the puzzle of the $\Delta^{++}$. This has been dealt
with in the section on the history of QCD. 

The first evidence for quarks as real constituent elements of hadrons was obtained in deep inelastic scattering 
experiments at SLAC. The first evidence for gluons came in three jet events at PETRA. 

Good quantitative tests of perturbative QCD are 

• the running of the QCD coupling as deduced from many observations 

• scaling violation in polarized and unpolarized deep inelastic scattering 

• vector boson production at colliders (this includes the Drell-Yan process) 

• jet cross sections in colliders 

• event shape observables at the LEP 

• heavy-quark production in colliders 

Quantitative tests of non-perturbative QCD are fewer, because the predictions are harder to make. The best is 
probably the running of the QCD coupling as probed through lattice computations of heavy-quarkonium spectra. 
There is a recent claim about the mass of the heavy meson $B_c$ [14]. Other non-perturbative tests are currently at the
level of 5% at best. Continuing work on masses and form factors of hadrons and their weak matrix elements provides
promising candidates for future quantitative tests. The whole subject of quark matter and the quark-gluon plasma is a
non-perturbative test bed for QCD which still remains to be properly exploited. 



Cross-relations to Solid State Physics 

There are unexpected cross-relations to solid state physics. For example, the notion of gauge invariance forms the basis of the well-known Mattis spin glasses,[15] which are systems with the usual spin degrees of freedom $s_i = \pm 1$ for $i = 1, \ldots, N$, with the special fixed "random" couplings $J_{i,k} = \epsilon_i J_0 \epsilon_k$. Here the $\epsilon_i$ and $\epsilon_k$ quantities can independently and "randomly" take the values ±1, which corresponds to a most simple gauge transformation $(s_i \to s_i \cdot \epsilon_i, \quad J_{i,k} \to \epsilon_i J_{i,k} \epsilon_k, \quad s_k \to s_k \cdot \epsilon_k)$. This means that thermodynamic expectation values of measurable quantities, e.g. of the energy $\mathcal{H} := -\sum s_i J_{i,k} s_k$, are invariant.

However, here the coupling degrees of freedom $J_{i,k}$, which in QCD correspond to the gluons, are "frozen" to fixed values (quenching). In contrast, in QCD they "fluctuate" (annealing), and through the large number of gauge degrees of freedom the entropy plays an important role (see below).

For positive $J_0$ the thermodynamics of the Mattis spin glass corresponds in fact simply to a ferromagnet, just because these systems have no "frustration" at all. This term is a basic measure in spin glass theory.[16] Quantitatively it is identical with the loop-product $P_W := J_{i,k} J_{k,l} \cdots J_{n,m} J_{m,i}$ along a closed loop W. However, for a Mattis spin glass, in contrast to "genuine" spin glasses, the quantity $P_W$ never becomes negative.

The basic notion "frustration" of the spin-glass is actually similar to the Wilson loop quantity of QCD. The only difference is again that in QCD one is dealing with SU(3) matrices, and that one is dealing with a "fluctuating" quantity. Energetically, perfect absence of frustration should be non-favorable and untypical for a spin glass, which means that one should add the loop-product to the Hamiltonian, by some kind of term representing a "punishment". In QCD the Wilson loop is essential for the Lagrangian right away.
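
The gauge invariance of the Mattis energy and the absence of frustration can be verified numerically on a small system. The following is a minimal sketch, assuming all-to-all couplings, a sum over pairs $i < k$ in the energy, and an arbitrary system size; these choices and the function names are made here for illustration and are not specified in the text.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 8        # number of spins (arbitrary illustrative size)
J0 = 1.0     # positive coupling strength

# Fixed "random" signs and the resulting Mattis couplings J_ik = eps_i * J0 * eps_k.
eps = rng.choice([-1, 1], size=N)
J = J0 * np.outer(eps, eps)
np.fill_diagonal(J, 0.0)   # no self-coupling

def energy(s, J):
    """H = - sum over pairs i < k of s_i J_ik s_k (J symmetric, zero diagonal)."""
    return -0.5 * s @ J @ s

def gauge_transform(s, J, g):
    """Apply s_i -> s_i g_i and J_ik -> g_i J_ik g_k for signs g_i = +/-1."""
    return s * g, J * np.outer(g, g)

def loop_product(J, loop):
    """Product of the couplings along a closed loop given as a list of sites."""
    p = 1.0
    for a, b in zip(loop, loop[1:] + loop[:1]):
        p *= J[a, b]
    return p

s = rng.choice([-1, 1], size=N)    # an arbitrary spin configuration
g = rng.choice([-1, 1], size=N)    # an arbitrary gauge transformation
s_new, J_new = gauge_transform(s, J, g)

print(np.isclose(energy(s, J), energy(s_new, J_new)))   # energy is gauge invariant
print(loop_product(J, [0, 3, 5]) > 0)                    # no frustration on this loop
```

Choosing `s = eps` aligns every bond, which gives the ferromagnetic ground-state energy and illustrates why the Mattis model is thermodynamically equivalent to a ferromagnet for positive $J_0$.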

The relation between QCD and "disordered magnetic systems" (the spin glasses belong to them) was additionally stressed in a paper by Fradkin, Huberman and Shenker,[17] which also stresses the notion of duality.

A further analogy consists in the already mentioned similarity to polymer physics, where, analogously to Wilson loops, so-called "entangled nets" appear, which are important for the formation of the entropy-elasticity (force proportional to the length) of a rubber band. The non-abelian character of the SU(3) corresponds thereby to the non-trivial "chemical links", which glue different loop segments together, and "asymptotic freedom" means in the polymer analogy simply the fact that in the short-wave limit, i.e. for $0 \leftarrow \lambda_w \ll R_c$ (where $R_c$ is a characteristic correlation-length for the glued loops, corresponding to the above-mentioned "bag radius", while $\lambda_w$ is the wavelength of an excitation) any non-trivial correlation vanishes totally, as if the system had crystallized.[18]

There is also a correspondence between confinement in QCD (the fact that the colour-field is only different from zero in the interior of hadrons) and the behaviour of the usual magnetic field in the theory of type-II superconductors: there the magnetism is confined to the interior of the Abrikosov flux-line lattice,[19] i.e., the London penetration depth $\lambda$ of that theory is analogous to the confinement radius $R_c$ of quantum chromodynamics. Mathematically, this correspondence is supported by the second term, $\propto g\, G^a_\mu \bar{\psi}_i \gamma^\mu T^a_{ij} \psi_j$, on the r.h.s. of the Lagrangian.



References 

[1] Gell-Mann, Murray (1995). The Quark and the Jaguar. Owl Books. ISBN 978-0805072532.

[2] Fyodor Tkachov (2009). "A contribution to the history of quarks: Boris Struminsky's 1965 JINR publication". arXiv:0904.0343 [physics.hist-ph].

[3] B. V. Struminsky, Magnetic moments of barions in the quark model. JINR-Preprint P-1939, Dubna, Russia. Submitted on January 7, 1965.

[4] N. Bogolubov, B. Struminsky, A. Tavkhelidze. On composite models in the theory of elementary particles. JINR Preprint D-1968, Dubna 1965.

[5] A. Tavkhelidze. Proc. Seminar on High Energy Physics and Elementary Particles, Trieste, 1965, Vienna IAEA, 1965, p. 763.

[6] V. A. Matveev and A. N. Tavkhelidze (INR, RAS, Moscow) The quantum number color, colored quarks and QCD (http://www.inr.ru/quantum.html) (Dedicated to the 40th Anniversary of the Discovery of the Quantum Number Color). Report presented at the 99th Session of the JINR Scientific Council, Dubna, 19-20 January 2006.

[7] J. Polchinski, M. Strassler (2002). "Hard Scattering and Gauge/String duality". Physical Review Letters 88: 31601. doi:10.1103/PhysRevLett.88.031601.

[8] Brower, Richard C; Mathur, Samir D.; Chung-I Tan (2000). "Glueball Spectrum for QCD from AdS Supergravity Duality". arXiv:hep-th/0003115 [hep-th].

[9] F. Wegner, Duality in Generalized Ising Models and Phase Transitions without Local Order Parameter, J. Math. Phys. 12 (1971) 2259-2272. Reprinted in Claudio Rebbi (ed.), Lattice Gauge Theories and Monte Carlo Simulations, World Scientific, Singapore (1983), p. 60-73. Abstract: (http://www.tphys.uni-heidelberg.de/~wegner/Abstracts.html#12)

[10] Perhaps one can guess that in the "original" model mainly the quarks would fluctuate, whereas in the present one, the "dual" model, mainly the gluons do.

[11] See all standard textbooks on the QCD, e.g., those noted above

[12] Only at extremely large pressures and/or temperatures, e.g. for $T \simeq 5 \cdot 10^{12}$ K or larger, confinement gives way to a quark-gluon plasma.

[13] Kenneth A. Johnson, The bag model of quark confinement, Scientific American, July 1979

[14] http://www.aip.org/pnu/2005/split/731-1.html

[15] D.C. Mattis, Phys. Lett. 56A (1976) 421

[16] J. Vannimenus and G. Toulouse, J. Phys. C 10 (1977) 537

[17] E. Fradkin, B.A. Huberman, S. Shenker, Gauge Symmetries in random magnetic systems, Phys. Rev. B 18 (1978) 4783-4794, (http://prb.aps.org/abstract/PRB/v18/i9/p4879-1)

[18] A. Bergmann, A. Owen, Dielectric relaxation spectroscopy of poly[(R)-3-Hydroxybutyrate] (PHD) during crystallization, Polymer International 53 (7) (2004) 863-868, (http://www3.interscience.wiley.com/journal/108563755/abstract?CRETRY=1&SRETRY=0)

[19] Mathematically, the flux-line lattices are described by Emil Artin's braid group, which is nonabelian, since one braid can wind around another one.

Further reading 

• Greiner, Walter; Schäfer, Andreas (1994). Quantum Chromodynamics. Springer. ISBN 0-387-57103-5.

• Halzen, Francis; Martin, Alan (1984). Quarks & Leptons: An Introductory Course in Modern Particle Physics. 
John Wiley & Sons. ISBN 0-471-88741-2. 

• Creutz, Michael (1985). Quarks, Gluons and Lattices. Cambridge University Press. ISBN 978-0521315357. 

External links 

• Particle data group (http://pdg.lbl.gov/) 

• The millennium prize (http://www.claymath.org/millennium/) for proving confinement (http://www.claymath.org/millennium/)

• Ab Initio Determination of Light Hadron Masses (http://www.sciencemag.org/cgi/content/abstract/322/5905/1224)

• Andreas S Kronfeld (http://www.sciencemag.org/cgi/content/summary/322/5905/1198) The Weight of the World Is Quantum Chromodynamics

• Andreas S Kronfeld (http://www.iop.org/EJ/article/1742-6596/125/1/012067/jpconf8_125_012067.pdf?request-id=f9ccdf0d-ee26-4856-99fb-ce5bfef07c4c) Quantum chromodynamics with advanced computing

• Standard model gets right answer (http://www.sciencenews.org/view/generic/id/38788/title/Standard_model_gets_right_answer_for_proton,_neutron_masses)



• Quantum Chromodynamics (http://arxiv.org/abs/hep-ph/9505231)

Quantum Geometry 

In theoretical physics, quantum geometry is the set of new mathematical concepts generalizing the concepts of 
geometry whose understanding is necessary to describe the physical phenomena at very short distance scales 
(comparable to Planck length). At these distances, quantum mechanics has a profound effect on physics. 

Each theory of quantum gravity uses the term quantum geometry in a slightly different fashion. String theory, a 
leading candidate for a quantum theory of gravity, uses the term quantum geometry to describe exotic phenomena 
such as T-duality and other geometric dualities, mirror symmetry, topology-changing transitions, minimal possible 
distance scale, and other effects that challenge our usual geometrical intuition. More technically, quantum geometry 
refers to the shape of the spacetime manifold as seen by D-branes which includes the quantum corrections to the 
metric tensor, such as the worldsheet instantons. For example, the quantum volume of a cycle is computed from the 
mass of a brane wrapped on this cycle. 

In an alternative approach to quantum gravity called loop quantum gravity (LQG), the phrase quantum geometry 
usually refers to the formalism within LQG where the observables that capture the information about the geometry 
are now well defined operators on a Hilbert space. In particular, certain physical observables, such as the area, have a 
discrete spectrum. It has also been shown that the loop quantum geometry is non-commutative. 
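
To illustrate what a discrete area spectrum looks like in practice, the sketch below evaluates the area eigenvalue formula commonly quoted in the loop quantum gravity literature, $A = 8\pi\gamma\,\ell_P^2 \sum_i \sqrt{j_i(j_i+1)}$, where the $j_i$ are the half-integer spins of the spin-network edges puncturing the surface. The formula is not stated in the text above; the value used for the Immirzi parameter $\gamma$ and the function name are assumptions made only for illustration.

```python
import math

PLANCK_LENGTH = 1.616e-35   # metres (approximate value)

def area_eigenvalue(spins, gamma=0.2375, l_p=PLANCK_LENGTH):
    """Area eigenvalue for a surface punctured by edges carrying the given spins.

    spins : iterable of half-integers j = 1/2, 1, 3/2, ...
    gamma : Immirzi parameter (illustrative value, an assumption here)
    """
    return 8 * math.pi * gamma * l_p**2 * sum(math.sqrt(j * (j + 1)) for j in spins)

# The lowest few single-puncture eigenvalues, in units of the Planck area.
for j in (0.5, 1.0, 1.5, 2.0):
    print(j, area_eigenvalue([j]) / PLANCK_LENGTH**2)
```

The smallest non-zero value, obtained for a single puncture with $j = 1/2$, sets an "area gap" below which no non-zero area eigenvalue exists in this formula.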

It is possible (but considered unlikely) that this strictly quantized understanding of geometry will be consistent with 
the quantum picture of geometry arising from string theory. 

Another, quite successful, approach, which tries to reconstruct the geometry of space-time from "first principles" is 
Discrete Lorentzian quantum gravity. 

External links 

• Space and Time: From Antiquity to Einstein and Beyond [1]

• Quantum Geometry and its Applications [2]

• Hypercomplex Numbers in Geometry and Physics [3]

References 

[1] http://cgpg.gravity.psu.edu/people/Ashtekar/articles/spaceandtime.pdf 
[2] http://cgpg.gravity.psu.edu/people/Ashtekar/articles/qgfinal.pdf 
[3] http://hypercomplex.xpsweb.com/articles/221/en/pdf/main-01e.pdf 



Loop Quantum Gravity 

Loop quantum gravity (LQG), also known as loop gravity and quantum geometry, is a proposed quantum theory 
of spacetime which attempts to reconcile the theories of quantum mechanics and general relativity. Loop quantum 
gravity suggests that space can be viewed as an extremely fine fabric or network "woven" of finite quantised loops of 
excited gravitational fields called spin networks. When viewed over time, these spin networks are called spin foam, 
which should not be confused with quantum foam. A major contender alongside string theory for a theory of quantum gravity, loop
quantum gravity incorporates general relativity without requiring string theory's higher dimensions.

LQG preserves many of the important features of general relativity, while simultaneously employing quantization of 
both space and time at the Planck scale in the tradition of quantum mechanics. The technique of loop quantization 
was developed for the nonperturbative quantization of diffeomorphism-invariant gauge theory. Roughly, LQG tries 
to establish a quantum theory of gravity in which the very space itself, where all other physical phenomena occur, 
becomes quantized. 

LQG is one of a family of theories called canonical quantum gravity. The LQG theory also includes matter and 
forces, but does not address the problem of the unification of all physical forces the way some other quantum gravity 
theories such as string theory do. 

History of LQG 

In 1986, Abhay Ashtekar reformulated Einstein's field equations of general relativity, using what have come to be 
known as Ashtekar variables, a particular flavor of Einstein-Cartan theory with a complex connection. In 1988, Carlo 
Rovelli and Lee Smolin used this formalism to introduce the loop representation of quantum general relativity, 
which was soon developed by Ashtekar, Rovelli, Smolin and many others. In the Ashtekar formulation, the 
fundamental objects are a rule for parallel transport (technically, a connection) and a coordinate frame (called a 
vierbein) at each point. Because the Ashtekar formulation was background-independent, it was possible to use 
Wilson loops as the basis for a nonperturbative quantization of gravity. Explicit (spatial) diffeomorphism invariance 
of the vacuum state plays an essential role in the regularization of the Wilson loop states. 

Around 1990, Rovelli and Smolin obtained an explicit basis of states of quantum geometry, which turned out to be 
labelled by Roger Penrose's spin networks, and showed that the geometry is quantized, that is, the 
(non-gauge-invariant) quantum operators representing area and volume have a discrete spectrum. In this context, 
spin networks arose as a generalization of Wilson loops necessary to deal with mutually intersecting loops. 
Mathematically, spin networks are related to group representation theory and can be used to construct knot invariants 
such as the Jones polynomial. 

Key concepts of loop quantum gravity 

In the framework of quantum field theory, and using the standard techniques of perturbative calculations, one finds 
that gravitation is non-renormalizable in contrast to the electroweak and strong interactions of the Standard Model of 
particle physics. This implies that there are infinitely many free parameters in the theory and thus that it cannot be 
predictive. 

In general relativity, the Einstein field equations assign a geometry (via a metric) to space-time. Before this, there is 
no physical notion of distance or time measurements. In this sense, general relativity is said to be background 
independent. An immediate conceptual issue that arises is that the usual framework of quantum mechanics, including 
quantum field theory, relies on a reference (background) space-time. Therefore, one approach to finding a quantum 
theory of gravity is to understand how to do quantum mechanics without relying on such a background; this is the
route taken by canonical quantization, loop quantum gravity and spin foam models.



Starting with the initial-value-formulation of general relativity (cf. the evolution equations of general relativity), the result is an analogue of the Schrödinger equation called the Wheeler-deWitt equation, which some argue is ill-defined.[1] A major break-through came with the introduction of what are now known as Ashtekar variables, which represent geometric gravity using mathematical analogues of electric and magnetic fields.[2] The resulting candidate for a theory of quantum gravity is Loop quantum gravity, in which space is represented by a network structure called a spin network, evolving over time in discrete steps.[3]

[Figure: Simple spin network of the type used in loop quantum gravity]

Though not proven, it may be impossible to quantize gravity in 3+1 dimensions without creating matter and energy artifacts. Should LQG succeed as a quantum theory of gravity, the known matter fields will have to be incorporated into the theory a posteriori. Many of the approaches now being actively pursued (by Renate Loll, Jan Ambjørn, Lee Smolin, Sundance Bilson-Thompson, Laurent Freidel, Mark B. Wise and others[4]) combine matter with geometry.

The main successes of loop quantum gravity are: 

1. It is a nonperturbative quantization of 3-space geometry, with quantized area and volume operators. 

2. It includes a calculation of the entropy of black holes. 

3. It replaces the Big Bang spacetime singularity with a Big Bounce. 

These claims are not universally accepted among the physics community, which is presently divided between 
different approaches to the problem of quantum gravity. LQG may possibly be viable as a refinement of either 
gravity or geometry. Many of the core results are rigorous mathematical physics; their physical interpretations 
remain speculative. Three speculative physical interpretations of LQG's core mathematical results are loop 
quantization, Lorentz invariance, General covariance and background independence, discussed below. Another 
physical test for LQG is to reproduce the physics of general relativity coupled with quantum field theory, discussed 
under problems. 



Loop quantization 

At the core of loop quantum gravity is a framework for nonperturbative quantization of diffeomorphism-invariant 
gauge theories, which one might call loop quantization. While originally developed in order to quantize vacuum 
general relativity in 3+1 dimensions, the formalism can accommodate arbitrary spacetime dimensionalities, 
fermions,[5] an arbitrary gauge group (or even quantum group), and supersymmetry,[6] and results in a quantization
of the kinematics of the corresponding diffeomorphism-invariant gauge theory. Much work remains to be done on 
the dynamics, the classical limit and the correspondence principle, all of which are necessary in one way or another 
to make contact with experiment. 

In a nutshell, loop quantization is the result of applying C*-algebraic quantization to a non-canonical algebra of 
gauge-invariant classical observables. Non-canonical means that the basic observables quantized are not generalized 
coordinates and their conjugate momenta. Instead, the algebra generated by spin network observables (built from 
holonomies) and field strength fluxes is used. 
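
The role of holonomies as gauge-invariant building blocks can be illustrated with a toy computation. The sketch below assumes a closed loop discretized into a handful of links, each carrying an SU(2) matrix, and checks that the trace of the ordered product (the Wilson loop) is unchanged by independent gauge rotations at the vertices. The discretization, the choice of SU(2) and the function names are assumptions for illustration; this is not the full spin-network construction.

```python
import numpy as np

def random_su2(rng):
    """Draw an SU(2) matrix from a normalized random quaternion (a, b, c, d)."""
    q = rng.normal(size=4)
    a, b, c, d = q / np.linalg.norm(q)
    return np.array([[a + 1j * b, c + 1j * d],
                     [-c + 1j * d, a - 1j * b]])

def wilson_loop(links):
    """Trace of the ordered product of link matrices around a closed loop."""
    h = np.eye(2, dtype=complex)
    for U in links:
        h = h @ U
    return np.trace(h)

def gauge_transform(links, gs):
    """Rotate link k by g_k at its start vertex and g_{k+1}^dagger at its end (cyclic)."""
    n = len(links)
    return [gs[k] @ links[k] @ gs[(k + 1) % n].conj().T for k in range(n)]

rng = np.random.default_rng(1)
links = [random_su2(rng) for _ in range(5)]   # holonomies along five links of a loop
gs = [random_su2(rng) for _ in range(5)]      # one gauge rotation per vertex

before = wilson_loop(links)
after = wilson_loop(gauge_transform(links, gs))
print(np.isclose(before, after))              # the trace is gauge invariant
```

The gauge rotations cancel pairwise at every vertex, and the remaining pair at the base point cancels under the trace, which is why only closed loops (and their generalizations to networks with suitable intertwiners at the vertices) yield gauge-invariant observables.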

Loop quantization techniques are particularly successful in dealing with topological quantum field theories, where they give rise to state-sum/spin-foam models such as the Turaev-Viro model of 2+1 dimensional general relativity. A much studied topological quantum field theory is the so-called BF theory in 3+1 dimensions. Since classical general relativity can be formulated as a BF theory with constraints, scientists hope that a consistent quantization of gravity may arise from the perturbation theory of BF spin-foam models.

Lorentz invariance

LQG is a quantization of a classical Lagrangian field theory which is equivalent to the usual Einstein-Cartan theory 
in that it leads to the same equations of motion describing general relativity with torsion. As such, it can be argued 
that LQG respects local Lorentz invariance. Global Lorentz invariance is broken in LQG just as in general relativity. 
A positive cosmological constant can be realized in LQG by replacing the Lorentz group with the corresponding 
quantum group. 

General covariance and background independence 

General covariance, also known as "diffeomorphism invariance", is the invariance of physical laws under arbitrary coordinate transformations. An example of this are the equations of general relativity, where this symmetry is one of the defining features of the theory. LQG preserves this symmetry by requiring that the physical states remain invariant under the generators of diffeomorphisms. The interpretation of this condition is well understood for purely spatial diffeomorphisms. However, the understanding of diffeomorphisms involving time (the Hamiltonian constraint) is more subtle because it is related to dynamics and the so-called problem of time in general relativity.[7] A generally accepted calculational framework to account for this constraint is yet to be found.[8][9]

Whether or not Lorentz invariance is broken in the low-energy limit of LQG, the theory is formally background independent. The equations of LQG are not embedded in, and do not presuppose, space and time, except for its invariant topology. Instead, they are expected to give rise to space and time at distances which are large compared to the Planck length.



Problems 

While there has been a recent proposal relating to observation of naked singularities,[10] and doubly special relativity, as a part of a program called loop quantum cosmology, as of now there is no experimental observation for which loop quantum gravity makes a prediction not made by the Standard Model or general relativity (a problem that plagues all current theories of quantum gravity).

Making predictions from the theory of LQG has been extremely difficult computationally, also a recurring problem 
with modern theories in physics. 

Another problem is that a crucial free parameter in the theory known as the Immirzi parameter can only be computed 
by demanding agreement with Bekenstein and Hawking's calculation of the black hole entropy. Loop quantum 
gravity predicts that the entropy of a black hole is proportional to the area of the event horizon, but does not obtain 
the Bekenstein-Hawking formula S = A/4 unless the Immirzi parameter is chosen to give this value. A prediction 
directly from theory would be preferable. 
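
For orientation, the Bekenstein-Hawking formula $S = A/4$ (in units where the Planck area and Boltzmann's constant are set to one) can be evaluated for a concrete case. The sketch below assumes a non-rotating Schwarzschild black hole with horizon area $A = 16\pi G^2 M^2 / c^4$; the rounded physical constants and the function names are illustrative choices, not taken from the text.

```python
import math

# Rounded SI values (approximate).
G = 6.674e-11        # gravitational constant, m^3 kg^-1 s^-2
HBAR = 1.0546e-34    # reduced Planck constant, J s
C = 2.998e8          # speed of light, m/s
M_SUN = 1.989e30     # solar mass, kg

def horizon_area(mass):
    """Horizon area of a Schwarzschild black hole of the given mass (m^2)."""
    return 16 * math.pi * (G * mass) ** 2 / C**4

def bh_entropy_over_kB(mass):
    """Bekenstein-Hawking entropy S / k_B = A / (4 l_P^2)."""
    planck_area = G * HBAR / C**3
    return horizon_area(mass) / (4 * planck_area)

print(f"S / k_B for one solar mass: {bh_entropy_over_kB(M_SUN):.3e}")
```

Because the area scales as the square of the mass, doubling the mass quadruples the entropy. In loop quantum gravity the proportionality to the area is derived, while the factor 1/4 is matched by the choice of the Immirzi parameter, as described above.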

Presently, no semiclassical limit recovering general relativity has been shown to exist. This means it remains unproven that LQG's description of spacetime at the Planck scale has the right continuum limit, described by general relativity with possible quantum corrections. Specifically, the dynamics of the theory is encoded in the Hamiltonian constraint, but there is no candidate Hamiltonian. Other technical problems include finding off-shell closure of the constraint algebra and a physical inner product, coupling to the matter fields of quantum field theory, and the fate of the renormalization of the graviton in perturbation theory, which leads to ultraviolet divergences beyond two loops (see one-loop Feynman diagram). The fate of Lorentz invariance in loop quantum gravity remains an open problem.[13]






Current LQG research directions attempt to address these known problems, and include spinfoam models and 
entropic gravity.[15] 

See also 

• Loop quantum cosmology 

• Heyting algebra 

• Category theory 

• Noncommutative geometry 

• Topos theory 

• C*-algebra 

• Regge calculus 

• Doubly special relativity 

• Lorentz invariance in loop quantum gravity 

• Immirzi parameter 

• Invariance mechanics 

• spin foam 

• group field theory 

• string-net 

• Kodama state 

• supersymmetry 

Notes 

[1] Cf. section 3 in Kuchař 1973. 

[2] See Ashtekar 1986, Ashtekar 1987. 

[3] For a review, see Thiemann 2006; more extensive accounts can be found in Rovelli 1998, Ashtekar & Lewandowski 2004 as well as in the 
lecture notes Thiemann 2003. 

[4] See List of loop quantum gravity researchers. 

[5] Baez, John; Krasnov, Kirill: Quantization of Diffeomorphism-Invariant Theories with Fermions, arXiv:hep-th/9703112 

[6] Ling, Yi; Smolin, Lee: Supersymmetric Spin Networks and Quantum Supergravity (http://www.arxiv.org/abs/hep-th/9904016) 

[7] See, e.g., Stuart Kauffman and Lee Smolin, "A Possible Solution For The Problem Of Time In Quantum Cosmology" (1997) 
(http://www.edge.org/3rd_culture/smolin/smolin_p1.html) 

[8] See, e.g., Lee Smolin, "The Case for Background Independence", in Dean Rickles, et al. (eds.) The Structural Foundations of Quantum 
Gravity (2006), p 196 ff. 

[9] For a highly technical explanation, see, e.g., Carlo Rovelli (2004). Quantum Gravity, p 13 ff. 

[10] Goswami et al. "Quantum evaporation of a naked singularity" (http://arxiv.org/abs/gr-qc/0506129). Physical Review Letters. Retrieved 
2010-01-02. 

[11] http://arxiv.org/abs/hep-th/0501114 

[12] http://arxiv.org/abs/hep-th/0501114 

[13] "Testing Einstein's special relativity with Fermi's short hard gamma-ray burst GRB090510" (http://arxiv.org/abs/0908.1832). Fermi 
GBM/LAT Collaborations. Retrieved August 13, 2009. 

[14] http://math.ucr.edu/home/baez/week280.html 

[15] Newtonian gravity in loop quantum gravity (http://arxiv.org/abs/1001.3668v1), Lee Smolin, 2010 






References 

• Topical Reviews 

• Rovelli, Carlo (1998), "Loop Quantum Gravity" (http://www.livingreviews.org/lrr-1998-1), Living Rev. 
Relativity 1, retrieved 2008-03-13 

• Thiemann, Thomas (2003), "Lectures on Loop Quantum Gravity" (http://arxiv.org/abs/gr-qc/0210094), 
Lect. Notes Phys. 631: 41-135 

• Ashtekar, Abhay; Lewandowski, Jerzy (2004), "Background Independent Quantum Gravity: A Status Report" 
(http://arxiv.org/abs/gr-qc/0404018), Class. Quant. Grav. 21: R53-R152, 
doi:10.1088/0264-9381/21/15/R01, arXiv:gr-qc/0404018 

• Carlo Rovelli and Marcus Gaul, Loop Quantum Gravity and the Meaning of Diffeomorphism Invariance, 
e-print available as gr-qc/9910079 (http://arxiv.org/abs/gr-qc/9910079). 

• Lee Smolin, The case for background independence, e-print available as hep-th/0507235 (http://arxiv.org/ 
abs/hep-th/0507235). 

• Alejandro Corichi, Loop Quantum Geometry: A primer, e-print available as (http://arxiv.org/abs/gr-qc/ 
0507038v2). 

• Alejandro Perez, Introduction to loop quantum gravity and spin foams, e-print available as (http://arxiv.org/ 
abs/gr-qc/0409061v3). 

• Hermann Nicolai and Kasper Peeters, Loop and spin foam quantum gravity: A Brief guide for beginners, 
e-print available as (http://arxiv.org/abs/hep-th/0601129v2). 

• Popular books: 

• Lee Smolin, Three Roads to Quantum Gravity 

• Carlo Rovelli, Che cos'è il tempo? Che cos'è lo spazio?, Di Renzo Editore, Roma, 2004. French translation: 
Qu'est-ce que le temps? Qu'est-ce que l'espace?, Bernard Gilson ed, Brussel, 2006. English translation: What is 
Time? What is space?, Di Renzo Editore, Roma, 2006. 

• Julian Barbour, The End of Time: The Next Revolution in Our Understanding of the Universe 

• Musser, George (2008), The Complete Idiot's Guide to String Theory, Indianapolis: Alpha, pp. 368, 
ISBN 978-1-59-257702-6 - Focuses on string theory but has an extended discussion of loop gravity as well. 

• Magazine articles: 

• Lee Smolin, "Atoms of Space and Time," Scientific American, January 2004 

• Martin Bojowald, "Following the Bouncing Universe," Scientific American, October 2008 

• Easier introductory, expository or critical works: 

• Abhay Ashtekar, Gravity and the quantum, e-print available as gr-qc/0410054 (http://arxiv.org/abs/gr-qc/0410054) 
(2004) 

• John C. Baez and Javier Perez de Muniain, Gauge Fields, Knots and Quantum Gravity, World Scientific 
(1994) 

• Carlo Rovelli, A Dialog on Quantum Gravity, e-print available as hep-th/0310077 
(http://arxiv.org/abs/hep-th/0310077) (2003) 

• More advanced introductory/expository works: 

• Carlo Rovelli, Quantum Gravity, Cambridge University Press (2004); draft available online (http://www.cpt. 
univ-mrs.fr/~rovelli/book.pdf) 

• Thomas Thiemann, Introduction to modern canonical quantum general relativity, e-print available as 
gr-qc/0110034 (http://arxiv.org/abs/gr-qc/0110034) 

• Thomas Thiemann, Introduction to Modern Canonical Quantum General Relativity, Cambridge University 
Press (2007) 

• Abhay Ashtekar, New Perspectives in Canonical Gravity, Bibliopolis (1988). 

• Abhay Ashtekar, Lectures on Non-Perturbative Canonical Gravity, World Scientific (1991) 






• Rodolfo Gambini and Jorge Pullin, Loops, Knots, Gauge Theories and Quantum Gravity, Cambridge 
University Press (1996) 

• Hermann Nicolai, Kasper Peeters, Marija Zamaklar, Loop quantum gravity: an outside view, e-print available 
as hep-th/0501114 (http://arxiv.org/abs/hep-th/0501114) 

• H. Nicolai and K. Peeters, Loop and Spin Foam Quantum Gravity: A Brief Guide for Beginners, e-print 
available as hep-th/0601129 (http://lanl.arxiv.org/abs/hep-th/0601129) 

• T. Thiemann, The LQG-String: Loop Quantum Gravity Quantization of String Theory 
(http://arxiv.org/abs/hep-th/0401172v1) (2004) 

• Conference proceedings: 

• John C. Baez (ed.), Knots and Quantum Gravity 

• Fundamental research papers: 

• Ashtekar, Abhay (1986), "New variables for classical and quantum gravity", Phys. Rev. Lett. 57 (18): 
2244-2247, doi:10.1103/PhysRevLett.57.2244, PMID 10033673 

• Ashtekar, Abhay (1987), "New Hamiltonian formulation of general relativity", Phys. Rev. D36: 1587-1602, 
doi:10.1103/PhysRevD.36.1587 

• Roger Penrose, Angular momentum: an approach to combinatorial space-time in Quantum Theory and 
Beyond, ed. Ted Bastin, Cambridge University Press, 1971 

• Carlo Rovelli and Lee Smolin, Knot theory and quantum gravity, Phys. Rev. Lett., 61 (1988) 1155 

• Carlo Rovelli and Lee Smolin, Loop space representation of quantum general relativity, Nuclear Physics B331 
(1990) 80-152 

• Carlo Rovelli and Lee Smolin, Discreteness of area and volume in quantum gravity, Nucl. Phys., B442 (1995) 
593-622, e-print available as gr-qc/9411005 (http://xxx.lanl.gov/abs/gr-qc/9411005) 

• Kuchar, Karel (1973), "Canonical Quantization of Gravity", in Israel, Werner, Relativity, Astrophysics and 
Cosmology, D. Reidel, pp. 237-288, ISBN 90-277-0369-8 

• Thiemann, Thomas (2006), Loop Quantum Gravity: An Inside View, arXiv:hep-th/0608210 

External links 

• "Loop Quantum Gravity" by Carlo Rovelli (http://cgpg.gravity.psu.edu/people/Ashtekar/articles/rovelli03. 
pdf) Physics World, November 2003 

• Quantum Foam and Loop Quantum Gravity (http://universe-review.ca/R01-07-quantumfoam.htm) 

• Abhay Ashtekar: Semi-Popular Articles . Some excellent popular articles suitable for beginners about space, time, 
GR, and LQG. (http://cgpg.gravity.psu.edu/people/Ashtekar/articles.html) 

• Loop Quantum Gravity: Lee Smolin. (http://www.edge.org/3rd_culture/smolin03/smolin03_index.html) 

• Loop Quantum Gravity on arxiv.org (http://xstructure.inr.ac.ru/x-bin/theme3.py?level=2&index1=205615) 

• A list of LQG references catered to fresh graduates (http://sps.nus.edu.sg/~wongjian/lqg.html) 

• Loop Quantum Gravity Lectures Online (http://www.perimeterinstitute.ca/Events/ 
Introduction_to_Quantum_Gravity/Introduction_to_Quantum_Gravity/) by Lee Smolin 

• Spin networks, spin foams and loop quantum gravity (http://jdc.math.uwo.ca/spin-foams/) 

• Wired magazine, News: Moving Beyond String Theory (http://www.wired.com/news/technology/0,71828-0.html) 

• April 2006 Scientific American Special Issue, A Matter of Time, has Lee Smolin LQG Article Atoms of Space and 
Time (http://www.sciam.com/special/toc.cfm?issueid=40&sc=rt_nav_list) 

• September 2006, The Economist, article Looping the loop (http://www.economist.com/science/displaystory. 
cfm?story_id=7963608) 

• Gamma-ray Large Area Space Telescope: http://glast.gsfc.nasa.gov/ 






• Zeno meets modern science, (http://uk.arxiv.org/abs/physics/0505042) Article from Acta Physica Polonica B 
(http://th-www.if.uj.edu.pl/acta/) by Z.K. Silagadze. 

• Did pre-big bang universe leave its mark on the sky? 
(http://space.newscientist.com/article/mg19826514.300-did-prebig-bang-universe-leave-its-mark-on-the-sky.html?feedId=online-news_rss20) 
- According to a model based on "loop quantum gravity" theory, a parent universe that existed before ours may have 
left an imprint (New Scientist, 10 April 2008) 

Quantum Algebraic Topology 

In physics, topological order [1] is a new kind of order (a new kind of organization of particles) in a quantum state 
that is beyond the Landau symmetry-breaking description. It cannot be described by local order parameters and 
long-range correlations. Instead, topological orders can be described by a new set of quantum numbers, such as ground 
state degeneracy, quasiparticle fractional statistics, edge states, and topological entropy. Roughly speaking, 
topological order is a pattern of long-range quantum entanglement in quantum states. States with different 
topological orders can change into each other only through a phase transition. 

Background 

Although all matter is formed by atoms, matter can have very different properties and appear in very different forms, 
such as solid, liquid, superfluid, magnet, etc. According to condensed matter physics and the principle of emergence, 
the different properties of materials originate from the different ways in which the atoms are organized in the 
materials. Those different organizations of the atoms (or other particles) are formally called the orders in the 
materials. 

Atoms can organize in many ways which lead to many different orders and many different types of materials. 
Landau symmetry-breaking theory provides a general understanding of these different orders. It points out that 
different orders really correspond to different symmetries in the organizations of the constituent atoms. As a material 
changes from one order to another order (i.e., as the material undergoes a phase transition), what happens is that the 
symmetry of the organization of the atoms changes. 

For example, atoms have a random distribution in a liquid, so a liquid remains the same when we displace it by an 
arbitrary distance. We say that a liquid has a continuous translation symmetry. After a phase transition, a liquid can 
turn into a crystal. In a crystal, atoms organize into a regular array (a lattice). A lattice remains unchanged only when 
we displace it by a particular distance (an integer multiple of the lattice constant), so a crystal has only discrete 
translation symmetry. The phase transition between a liquid and a crystal is a transition that reduces the continuous 
translation symmetry of the liquid to the discrete symmetry of the crystal. Such a change in symmetry is called symmetry 
breaking. The essence of the difference between liquids and crystals is therefore that the organizations of atoms have 
different symmetries in the two phases. 
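
The discrete translation symmetry of a crystal described above can be checked on a toy model. The following is a minimal sketch, assuming a one-dimensional lattice on a ring (periodic boundary conditions); the function names `lattice` and `is_symmetry`, the lattice constant `a = 1`, and the number of sites are illustrative choices, not taken from the text.

```python
from fractions import Fraction


def lattice(a, n_sites):
    """Positions of n_sites atoms with spacing a on a ring of length n_sites * a."""
    return {a * k for k in range(n_sites)}


def is_symmetry(positions, shift, ring_length):
    """True if translating every atom by `shift` maps the set of positions onto itself."""
    moved = {(x + shift) % ring_length for x in positions}
    return moved == positions


a = Fraction(1)   # lattice constant (illustrative value)
n_sites = 8       # number of atoms on the ring (illustrative value)
ring = a * n_sites
crystal = lattice(a, n_sites)

# A shift by an integer multiple of the lattice constant leaves the crystal unchanged.
print(is_symmetry(crystal, 3 * a, ring))
# A shift by half a lattice constant does not.
print(is_symmetry(crystal, Fraction(1, 2), ring))
```

Exact rational arithmetic is used so that the comparison of positions is not affected by floating-point rounding. A liquid, by contrast, is invariant only in a statistical sense (its average density is unchanged by any shift), which this toy check does not attempt to model.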

Landau symmetry-breaking theory is a very successful theory. For a long time, physicists believed that Landau 
symmetry-breaking theory describes all possible orders in materials, and all possible (continuous) phase transitions. 

The discovery and characterization of topological order 

However, since the late 1980s, it has gradually become apparent that Landau symmetry-breaking theory may not 
describe all possible orders. In an attempt to explain high-temperature superconductivity,[2] physicists introduced 
the chiral spin state.[3][4] At first, they still wanted to use Landau symmetry-breaking theory to describe the chiral 
spin state. They identified it as a state that breaks the time-reversal and parity symmetries, but not the 
spin rotation symmetry. According to Landau's symmetry-breaking description of orders, this should have been the end 
of the story. However, it was quickly realized that there are many different chiral spin states that have exactly the same 
symmetry, so symmetry alone was not enough to characterize different chiral spin states. This means that the chiral 
spin states contain a new kind of order that is beyond the usual symmetry description.[5] The proposed new kind of 
order was named "topological order".[6] (The name "topological order" is motivated by the low-energy effective 
theory of the chiral spin states, which is a topological quantum field theory (TQFT).[7][8][9]) New quantum 
numbers, such as the ground state degeneracy [10] and the non-Abelian Berry's phase of degenerate ground states,[11] 
were introduced to characterize the different topological orders in chiral spin states. Recently, it was shown that 
topological orders can also be characterized by topological entropy.[12][13] 
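
The topological entropy mentioned above can be stated concretely. As a sketch following the definition used in refs. [12] and [13], consider a region with a smooth boundary of length L in a gapped two-dimensional ground state; the symbol \gamma_{\mathrm{top}} is a label chosen here for illustration.

```latex
% Entanglement entropy of a region with boundary length L
S(L) = \alpha L - \gamma_{\mathrm{top}} + \cdots,
\qquad \gamma_{\mathrm{top}} = \ln \mathcal{D},
\qquad \mathcal{D} = \sqrt{\sum_a d_a^2}.
```

Here \alpha is a non-universal coefficient, the omitted terms vanish as L grows, and d_a is the quantum dimension of quasiparticle type a. A state without topological order has \mathcal{D} = 1 and hence \gamma_{\mathrm{top}} = 0, whereas a Z2 topological order has four Abelian quasiparticle types with d_a = 1, giving \mathcal{D} = 2 and \gamma_{\mathrm{top}} = \ln 2.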

But experiments soon indicated that chiral spin states do not describe high-temperature superconductors, and the 
theory of topological order became a theory with no experimental realization. However, the similarity between chiral 
spin states and quantum Hall states allows one to use the theory of topological order to describe different quantum 
Hall states.[14] Just like chiral spin states, different quantum Hall states all have the same symmetry and are beyond 
the Landau symmetry-breaking description. One finds that the different orders in different quantum Hall states can 
indeed be described by topological orders, so topological order does have experimental realizations. 

Fractional quantum Hall (FQH) states were discovered in 1982,[15][16] before the introduction of the concept of 
topological order. But FQH states are not the first experimentally discovered topologically ordered states. 
Superconductors, discovered in 1911, are the first; they have a Z2 topological order (note, however, that 
superconductivity does fall within the Ginzburg-Landau theory description of phase transitions; in fact, the 
prediction of the vortex state in superconductors was one of the main successes of Ginzburg-Landau theory).[17][18] 



Mechanism of topological order 

A large class of topological orders is realized through a mechanism called string-net condensation.[19] This class of 
topological orders can be classified by utilizing tensor category (or monoidal category) theory. One finds that 
string-net condensation can generate infinitely many different types of topological orders, which may indicate that 
there are many different new types of materials remaining to be discovered. 

The collective motions of condensed strings give rise to excitations above the string-net condensed states. Those 
excitations turn out to be gauge bosons. The ends of strings are defects which correspond to another type of 
excitation. Those excitations are the gauge charges and can carry Fermi or fractional statistics.[20] 

The condensations of other extended objects such as "membranes",[21] "brane-nets",[22] and fractals also lead to 
topologically ordered phases [23] and "quantum glassiness".[24] 



Mathematical foundation of topological order 

We know that group theory is the mathematical foundation of symmetry-breaking orders. What is the mathematical 
foundation of topological order? String-net condensation suggests that tensor category (or monoidal category) 
theory may be the mathematical foundation of topological order. Quantum operator algebra is a very important 
mathematical tool in studying topological orders. Some also suggest that topological order is mathematically 
described by extended quantum symmetry.[25] 



Applications 

The materials described by Landau symmetry-breaking theory have had a substantial impact on technology. For 
example, ferromagnetic materials that break spin rotation symmetry can be used as the media of digital information 
storage. A hard drive made of ferromagnetic materials can store gigabytes of information. Liquid crystals that break 
the rotational symmetry of molecules find wide application in display technology; nowadays one can hardly find a 
household without a liquid crystal display somewhere in it. Crystals that break translation symmetry lead to well 
defined electronic bands which in turn allow us to make semiconducting devices such as transistors. 






The variety of topological orders is even richer than that of symmetry-breaking orders. This 
suggests their potential for exciting, novel applications. 

One theorized application would be to use topologically ordered states as media for quantum computing in a 
technique known as topological quantum computing. A topologically ordered state is a state with complicated 
non-local quantum entanglement. The non-locality means that the quantum entanglement in a topologically ordered 
state is distributed among many different particles. As a result, the pattern of quantum entanglement cannot be 
destroyed by local perturbations, which significantly reduces the effect of decoherence. This suggests that if we use 
different quantum entanglements in a topologically ordered state to encode quantum information, the information 
may last much longer.[26] The quantum information encoded by the topological quantum entanglements can also be 
manipulated by dragging the topological defects around each other. This process may provide a physical apparatus 
for performing quantum computations.[27] Therefore, topologically ordered states may provide natural media for 
both quantum memory and quantum computation. Such realizations of quantum memory and quantum computation 
may potentially be made fault tolerant.[28] 

Topologically ordered states in general have the special property that they contain non-trivial boundary states. In many 
cases, those boundary states become perfect conducting channels that can conduct electricity without generating 
heat.[29] This can be another potential application of topological order in electronic devices. For example, the 
topological insulator [30][31] is a simple example of topological order. The boundary states of topological insulators 
play a key role in the detection and the application of topological insulators. 



Potential impact 

Why is topological order important? Landau symmetry-breaking theory is a cornerstone of condensed matter 
physics. It is used to define the territory of condensed matter research. The existence of topological order appears to 
indicate that nature is much richer than Landau symmetry-breaking theory has so far indicated. The exciting time of 
condensed matter physics is still ahead of us. Some suggest that topological order (or more precisely, string-net 
condensation) and the local bosonic (spin) models [32] have the potential to provide a unified origin for photons, 
electrons and other elementary particles in our universe.[33] 



References 

[1] Xiao-Gang Wen, Topological Orders in Rigid States (http://dao.mit.edu/~wen/pub/topo.pdf), Int. J. Mod. Phys. B4, 239 (1990) 

[2] J. G. Bednorz and K. A. Mueller (1986). "Possible high TC superconductivity in the Ba-La-Cu-O system". Z. Phys. B64 (2): 189-193. 
doi:10.1007/BF01303701. 

[3] V. Kalmeyer and R. B. Laughlin, Phys. Rev. Lett., 59, 2095 (1987), "Equivalence of the resonating-valence-bond and fractional quantum Hall 
states" 

[4] Xiao-Gang Wen, F. Wilczek and A. Zee, Phys. Rev., B39, 11413 (1989), "Chiral Spin States and Superconductivity" 

[5] Xiao-Gang Wen, Phys. Rev. B, 40, 7387 (1989), "Vacuum Degeneracy of Chiral Spin State in Compactified Spaces" 

[6] Xiao-Gang Wen, Intl. J. Mod. Phys., B4, 239 (1990), "Topological Orders in Rigid States" (http://dao.mit.edu/~wen/pub/topo.pdf) 

[7] Atiyah, Michael (1988), "Topological quantum field theories", Publications Mathématiques de l'IHÉS (68): 175-186, MR1001453, ISSN 
1618-1913, http://www.numdam.org/item?id=PMIHES_1988_68__175_0 

[8] Witten, Edward (1988), "Topological quantum field theory", Communications in Mathematical Physics 117 (3): 353-386, MR953828, ISSN 
0010-3616, http://projecteuclid.org/euclid.cmp/1104161738 

[9] Yetter D.N., TQFTs from homotopy 2-types, J. Knot Theory 2 (1993), 113-123. 

[10] Xiao-Gang Wen, Phys. Rev. B, 40, 7387 (1989), "Vacuum Degeneracy of Chiral Spin State in Compactified Spaces" 

[11] Xiao-Gang Wen, Intl. J. Mod. Phys., B4, 239 (1990), "Topological Orders in Rigid States" (http://dao.mit.edu/~wen/pub/topo.pdf) 

[12] Alexei Kitaev and John Preskill, Phys. Rev. Lett. 96, 110404 (2006), "Topological Entanglement Entropy" 

[13] Levin M. and Wen X-G., Detecting topological order in a ground state wave function, Phys. Rev. Lett. 96 (11), 110405 (2006) 
(http://link.aps.org/doi/10.1103/PhysRevLett.96.110405) 

[14] Xiao-Gang Wen and Qian Niu, Phys. Rev. B41, 9377 (1990), "Ground state degeneracy of the FQH states in presence of random potential 
and on high genus Riemann surfaces" (http://dao.mit.edu/~wen/pub/topWN.pdf) 

[15] D. C. Tsui and H. L. Stormer and A. C. Gossard, Phys. Rev. Lett., 48, 1559 (1982), "Two-Dimensional Magnetotransport in the Extreme 
Quantum Limit" 






[16] R. B. Laughlin, Phys. Rev. Lett., 50, 1395 (1983), "Anomalous Quantum Hall Effect: An Incompressible Quantum Fluid with Fractionally 
Charged Excitations" 

[17] Xiao-Gang Wen, Mean Field Theory of Spin Liquid States with Finite Energy Gaps and Topological Orders, Phys. Rev. B44, 2664 (1991) 
(http://link.aps.org/doi/10.1103/PhysRevB.44.2664). 

[18] T. H. Hansson, Vadim Oganesyan, S. L. Sondhi, Superconductors are topologically ordered (http://arxiv.org/abs/cond-mat/0404327), 
Annals of Physics vol. 313, 497 (2004) 

[19] Michael Levin, Xiao-Gang Wen, Phys. Rev. B, 71, 045110 (2005), "String-net condensation: A physical mechanism for topological phases" 

[20] Levin M. and Wen X-G., Fermions, strings, and gauge fields in lattice spin models, Phys. Rev. B 67, 245316 (2003). 

[21] Hamma et al., 2005 

[22] Bombin, M.A. Martin-Delgado, 2006 

[23] Xiao-Gang Wen, Int. J. Mod. Phys. B5, 1641 (1991); Topological Orders and Chern-Simons Theory in strongly correlated quantum liquid, a 
review containing comments on topological orders in higher dimensions and/or in Higgs phases; also introduced a dimension index (DI) to 
characterize the robustness of the ground state degeneracy of a topologically ordered state. If DI is less than or equal to 1, then topological 
orders cannot exist at finite temperature. 

[24] Quantum Glassiness, Chamon C., Phys. Rev. Lett., 94, 040402 (2005). (http://link.aps.org/doi/10.1103/PhysRevLett.94.040402) 

[25] Algebraic Topology Foundations of Supersymmetry and Symmetry Breaking in Quantum Field Theory and Quantum Gravity: A Review, 
Baianu, I.C., J.F. Glazebrook and R. Brown, SIGMA-081030 (2009), 78 pages. 

[26] Eric Dennis, Alexei Kitaev, Andrew Landahl, and John Preskill, J. Math. Phys., 43, 4452 (2002), Topological quantum memory 

[27] Michael H. Freedman, Alexei Kitaev, Michael J. Larsen, and Zhenghan Wang, Bull. Amer. Math. Soc., 40, 31 (2003), "Topological 
quantum computation" 

[28] A. Yu. Kitaev, Ann. Phys. (N.Y.), 303, 1 (2003), Fault-tolerant quantum computation by anyons 

[29] Xiao-Gang Wen, Phys. Rev. B, 43, 11025 (1991), "Gapless Boundary Excitations in the FQH States and in the Chiral Spin States" 
(http://dao.mit.edu/~wen/pub/bdry.pdf) 

[30] S. Murakami, N. Nagaosa, and S.-C. Zhang, Phys. Rev. Lett. 93, 156804 (2004). 

[31] C. Kane and E. Mele, Phys. Rev. Lett. 95, 226801 (2005). 

[32] http://arxiv.org/abs/hep-th/0507118v2 

[33] Levin M. and Wen X-G., Colloquium: Photons and electrons as emergent phenomena, Rev. Mod. Phys. 77, 871 (2005) 
(http://arxiv.org/abs/hep-th/0507118v2), 4 pages; also, Quantum ether: Photons and electrons from a rotor model, arXiv:hep-th/0507118 (2007). 

References by categories 
Fractional quantum Hall states 

• D. C. Tsui and H. L. Stormer and A. C. Gossard, Phys. Rev. Lett., 48, 1559 (1982), "Two-Dimensional 
Magnetotransport in the Extreme Quantum Limit" 

• R. B. Laughlin, Phys. Rev. Lett., 50, 1395 (1983), "Anomalous Quantum Hall Effect: An Incompressible 
Quantum Fluid with Fractionally Charged Excitations" 

Chiral spin states 

• V. Kalmeyer and R. B. Laughlin, Phys. Rev. Lett., 59, 2095 (1987), "Equivalence of the resonating-valence-bond 
and fractional quantum Hall states" 

• Xiao-Gang Wen, F. Wilczek and A. Zee, Phys. Rev., B39, 11413 (1989), "Chiral Spin States and 
Superconductivity" 






Early characterization of FQH states 

• Off-diagonal long-range order, oblique confinement, and the fractional quantum Hall effect, S. M. Girvin and A. 
H. MacDonald, Phys. Rev. Lett., 58, 1252 (1987) 

• Effective-Field-Theory Model for the Fractional Quantum Hall Effect, S. C. Zhang and T. H. Hansson and S. 
Kivelson, Phys. Rev. Lett., 62, 82 (1989) 

Topological order 

• Xiao-Gang Wen, Phys. Rev. B, 40, 7387 (1989), "Vacuum Degeneracy of Chiral Spin State in Compactified 
Spaces" 

• Xiao-Gang Wen, Int. J. Mod. Phys., B4, 239 (1990), "Topological Orders in Rigid States" (http://dao.mit.edu/ 
~wen/pub/topo.pdf) 

• Xiao-Gang Wen, Quantum Field Theory of Many Body Systems — From the Origin of Sound to an Origin of Light 
and Electrons, Oxford Univ. Press, Oxford, 2004. 

Characterization of topological order 

• D. Arovas and J. R. Schrieffer and F. Wilczek, Phys. Rev. Lett., 53, 722 (1984), "Fractional Statistics and the 
Quantum Hall Effect" 

• Xiao-Gang Wen and Qian Niu, Phys. Rev. B41, 9377 (1990), "Ground state degeneracy of the FQH states in 
presence of random potential and on high genus Riemann surfaces" (http://dao.mit.edu/~wen/pub/topWN. 
pdf) 

• Xiao-Gang Wen, Phys. Rev. B, 43, 11025 (1991), "Gapless Boundary Excitations in the FQH States and in the 
Chiral Spin States" (http://dao.mit.edu/~wen/pub/bdry.pdf) 

• Alexei Kitaev and John Preskill, Phys. Rev. Lett. 96, 110404 (2006), "Topological Entanglement Entropy" 

• Michael Levin and Xiao-Gang Wen, Phys. Rev. Lett. 96, 110405 (2006), "Detecting Topological Order in a 
Ground State Wave Function" (http://link.aps.org/doi/10.1103/PhysRevLett.96.110405) 

Effective theory of topological order 

• Quantum field theory and the Jones polynomial, E. Witten, Comm. Math. Phys., 121, 351 (1989) 

Mechanism of topological order 

• Michael Levin, Xiao-Gang Wen, Phys. Rev. B, 71, 045110 (2005), String-net condensation: A physical 
mechanism for topological phases 

• Chamon, C, Phys. Rev. Lett. 94, 040402 (2005), Quantum Glassiness in Strongly Correlated Clean Systems: An 
Example of Topological Overprotection (http://link.aps.org/doi/10.1103/PhysRevLett.94.040402) 

• Alioscia Hamma, Paolo Zanardi, Xiao-Gang Wen, Phys.Rev. B72 035307 (2005), String and Membrane 
condensation on 3D lattices 

• H. Bombin, M.A. Martin-Delgado, cond-mat/0607736, Exact Topological Quantum Order in D=3 and Beyond: 
Branyons and Brane-Net Condensates 






Quantum computing 

• Chetan Nayak, Steven H. Simon (http://www-thphys.physics.ox.ac.uk/people/SteveSimon/), Ady Stern, 
Michael Freedman, Sankar Das Sarma, http://www.arxiv.org/abs/0707.1889, 2007, "Non-Abelian Anyons 
and Topological Quantum Computation" 

• A. Yu. Kitaev, Ann. Phys. (N.Y.), 303, 1 (2003), Fault-tolerant quantum computation by anyons 

• Michael H. Freedman, Alexei Kitaev, Michael J. Larsen, and Zhenghan Wang, Bull. Amer. Math. Soc, 40, 31 
(2003), "Topological quantum computation" 

• Eric Dennis, Alexei Kitaev, Andrew Landahl, and John Preskill, J. Math. Phys., 43, 4452 (2002), Topological 
quantum memory 

• Ady Stern and Bertrand I. Halperin, Phys. Rev. Lett., 96, 016802 (2006), Proposed Experiments to probe the 
Non-Abelian nu=5/2 Quantum Hall State 

Emergence of elementary particles 

• Xiao-Gang Wen, Phys. Rev. D68, 024501 (2003), Quantum order from string-net condensations and origin of 
light and massless fermions 

• M. Levin and Xiao-Gang Wen, Fermions, strings, and gauge fields in lattice spin models., Phys. Rev. B 67, 
245316, (2003). 

• M. Levin and Xiao-Gang Wen, Colloquium: Photons and electrons as emergent phenomena, Rev. Mod. Phys. 77, 
871 (2005), 4 pages; also, Quantum ether: Photons and electrons from a rotor 
model, arXiv:hep-th/0507118, 2007. 

• Zheng-Cheng Gu and Xiao-Gang Wen, gr-qc/0606100, A lattice bosonic model as a quantum theory of gravity 

Quantum operator algebra 

• Yetter D.N., TQFTs from homotopy 2-types, J. Knot Theory 2 (1993), 113-123. 

• Landsman N. P. and Ramazan B., Quantization of Poisson algebras associated to Lie algebroids, in Proc. Conf. on 
Groupoids in Physics, Analysis and Geometry (Boulder CO, 1999), Editors J. Kaminker et al., 159-192, Contemp. 
Math. 282, Amer. Math. Soc., Providence RI, 2001 (also math-ph/001005). 

• Non-Abelian Quantum Algebraic Topology (NAQAT), 20 Nov. (2008), 87 pages, Baianu, I.C. 
(http://planetphysics.org/?op=getobj&from=lec&id=61) 

• Levin A. and Olshanetsky M., Hamiltonian Algebroids and deformations of complex structures on Riemann 
curves, hep-th/0301078v1. 

• Xiao-Gang Wen, Yong-Shi Wu and Y. Hatsugai., Chiral operator product algebra and edge excitations of a FQH 
droplet (pdf), Nucl. Phys. B422, 476 (1994): Used chiral operator product algebra to construct the bulk wave 
function, characterize the topological orders and calculate the edge states for some non-Abelian FQH states. 

• Xiao-Gang Wen and Yong-Shi Wu., Chiral operator product algebra hidden in certain FQH states (pdf), Nucl. 
Phys. B419, 455 (1994): Demonstrated that non-Abelian topological orders are closely related to chiral operator 
product algebra (instead of conformal field theory). 

• Non-Abelian theory. (http://planetphysics.org/encyclopedia/NonAbelianTheory.html) 

• R. Brown et al., A Non-Abelian, Categorical Ontology of Spacetimes and Quantum Gravity, Axiomathes, Volume 
17, Numbers 3-4 / December (2007), pages 353-408, Springer, Netherlands, ISSN 1122-1151 (Print) 1572-8390 
(Online), doi:10.1007/s10516-007-9012-1. 

• Ronald Brown, P. J. Higgins and R. Sivera (2009), Nonabelian Algebraic Topology, vols. 1 and 2, Ch. U. Press, 
in press (http://www.bangor.ac.uk/~mas010/nonab-t/partI010604.pdf) 

• A Bibliography for Categories and Algebraic Topology Applications in Theoretical Physics 
(http://planetphysics.org/encyclopedia/BibliographyForCategoryTheoryAndAlgebraicTopologyApplicationsInTheoreticalPhysics.html) 



Quantum Algebraic Topology 



268 



• Quantum Algebraic Topology (QAT) (http://planetphysics.org/encyclopedia/ 
QuantumAlgebraicTopologyTopics.html) 

Commutativity 

In mathematics, an operation is commutative if changing the order of the operands does not change the end result. It 
is a fundamental property of many binary operations, and many mathematical proofs depend on it. The commutativity 
of simple operations, such as multiplication and addition of numbers, was for many years implicitly assumed, and the 
property was not named until the 19th century, when mathematics started to become formalized. By contrast, division 
and subtraction are not commutative. 

Common uses 

The commutative property (or commutative law) is a property associated with binary operations and functions. 
Similarly, if the commutative property holds for a pair of elements under a certain binary operation then it is said that 
the two elements commute under that operation. 

In group and set theory, many algebraic structures are called commutative when certain operands satisfy the 
commutative property. In higher branches of mathematics, such as analysis and linear algebra, the commutativity of 
well known operations (such as addition and multiplication on real and complex numbers) is often used (or implicitly 
assumed) in proofs.[1][2][3] 

Mathematical definitions 

The term "commutative" is used in several related senses.[4][5] 

1. A binary operation * on a set S is said to be commutative if: 

$\forall x, y \in S:\; x * y = y * x$ 

- An operation that does not satisfy the above property is called noncommutative. 

2. One says that x commutes with y under * if: 

$x * y = y * x$ 

3. A binary function $f: A \times A \to B$ is said to be commutative if: 

$\forall x, y \in A:\; f(x, y) = f(y, x)$ 
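
The three definitions above can be tried out on a small finite sample. The following is only an illustrative sketch: the helper names is_commutative and commute are hypothetical, and checking a finite sample can refute commutativity on a larger set but cannot prove it.

```python
from itertools import product


def is_commutative(op, elements):
    """Definition 1 or 3: op(x, y) == op(y, x) for every pair from elements."""
    elements = list(elements)
    return all(op(x, y) == op(y, x) for x, y in product(elements, repeat=2))


def commute(op, x, y):
    """Definition 2: the two given elements commute under op."""
    return op(x, y) == op(y, x)


sample = range(-3, 4)  # a small, arbitrary sample of integers

addition_commutes = is_commutative(lambda x, y: x + y, sample)
subtraction_commutes = is_commutative(lambda x, y: x - y, sample)

# Subtraction is noncommutative as an operation, yet some pairs still commute.
two_and_two_commute = commute(lambda x, y: x - y, 2, 2)
one_and_two_commute = commute(lambda x, y: x - y, 1, 2)
```

The last two lines illustrate the difference between the senses: an operation can be noncommutative overall while particular pairs of elements still commute under it.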




Example showing the commutativity of addition (3 + 2 = 2 + 3) 






History and etymology 



The first known use of the term was in a French journal published in 1814. 



Records of the implicit use of the commutative property go back to ancient times. The Egyptians used the 
commutative property of multiplication to simplify computing products.[6][7] Euclid is known to have assumed the 
commutative property of multiplication in his book Elements.[8] Formal uses of the commutative property arose in 
the late 18th and early 19th centuries, when mathematicians began to work on a theory of functions. Today the 
commutative property is a well known and basic property used in most branches of mathematics. 

The first recorded use of the term commutative was in a memoir by François Servois in 1814,[9][10] which used the 
word commutatives when describing functions that have what is now called the commutative property. The word is a 
combination of the French word commuter meaning "to substitute or switch" and the suffix -ative meaning "tending 
to" so the word literally means "tending to substitute or switch." The term then appeared in English in Philosophical 
Transactions of the Royal Society in 1844.[9] 



Related properties 



Associativity 

The associative property is closely related to the commutative 
property. The associative property of an expression containing two 
or more occurrences of the same operator states that the order in 
which operations are performed does not affect the final result, as 
long as the order of terms is not changed. In contrast, the 
commutative property states that the order of the terms does not 
affect the final result. 




Graph showing the symmetry of the addition function 



Symmetry can be directly linked to commutativity. When a commutative operator is written as a binary function then 
the resulting function is symmetric across the line y = x. As an example, if we let a function f represent addition 
(a commutative operation) so that f(x,y) = x + y then f is a symmetric function which can be seen in the image on 
the right. 

For binary relations, a symmetric relation is analogous to a commutative operation, in that if a relation R is 
symmetric, then $a R b \Leftrightarrow b R a$. 






Examples 

Commutative operations in everyday life 

• Putting on socks resembles a commutative operation, since which sock is put on first is unimportant. Either way, 
the end result (having both socks on), is the same. 

• The commutativity of addition is observed when paying for an item with cash. Regardless of the order in which 
the bills are handed over, they always give the same total. 

Commutative operations in mathematics 

Two well-known examples of commutative binary operations are:[4] 

• The addition of real numbers, which is commutative since 

$\forall y, z \in \mathbb{R}:\; y + z = z + y$ 
For example, 4 + 5 = 5 + 4, since both expressions equal 9. 

• The multiplication of real numbers, which is commutative since 

$\forall y, z \in \mathbb{R}:\; yz = zy$ 
For example, 3 × 5 = 5 × 3, since both expressions equal 15. 

• Further examples of commutative binary operations include addition and multiplication of complex numbers, 
addition and scalar multiplication of vectors, and intersection and union of sets. 

Noncommutative operations in everyday life 

• Concatenation, the act of joining character strings together, is a noncommutative operation. For example 

EA + T = EAT ≠ TEA = T + EA 

• Washing and drying clothes resembles a noncommutative operation; washing and then drying produces a 
markedly different result to drying and then washing. 

• The twists of the Rubik's Cube are noncommutative. This is studied in group theory. 

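Noncommutative operations in mathematics 

Some noncommutative binary operations are:[11] 

• Subtraction is noncommutative since $0 - 1 \neq 1 - 0$ 

• Division is noncommutative since $1/2 \neq 2/1$ 

• Infinite addition is not (necessarily) commutative: the partial sums of 

$1 - 1 + 1 - 1 + 1 - 1 + 1 - 1 + \cdots$ 

never exceed 1, whereas the rearranged series 

$1 + 1 - 1 + 1 + 1 - 1 + 1 + 1 - 1 + \cdots = \infty$ 

• Matrix multiplication is noncommutative since 

$\begin{bmatrix} 0 & 2 \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix} \cdot \begin{bmatrix} 0 & 1 \\ 0 & 1 \end{bmatrix} \neq \begin{bmatrix} 0 & 1 \\ 0 & 1 \end{bmatrix} \cdot \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} 0 & 1 \\ 0 & 1 \end{bmatrix}$ 

• The vector product (or cross product) of two vectors in three dimensions is anti-commutative, i.e., 
$b \times a = -(a \times b)$. 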
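
Two of the examples in this list, matrix multiplication and infinite addition, can be reproduced with a few lines of plain Python. This is a minimal sketch: the helpers matmul_2x2 and partial_sums are hypothetical names introduced here, and only finitely many terms of each series are summed.

```python
from itertools import accumulate, cycle, islice


def matmul_2x2(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


A = [[1, 1], [0, 1]]
B = [[0, 1], [0, 1]]
AB = matmul_2x2(A, B)
BA = matmul_2x2(B, A)


def partial_sums(pattern, count):
    """First `count` partial sums of the series that repeats `pattern`."""
    return list(accumulate(islice(cycle(pattern), count)))


alternating = partial_sums([1, -1], 12)     # 1 - 1 + 1 - 1 + ...
rearranged = partial_sums([1, 1, -1], 12)   # 1 + 1 - 1 + 1 + 1 - 1 + ...
```

The partial sums of the alternating series oscillate between 1 and 0, while those of the rearranged series grow by one for every three terms, so the same collection of terms behaves differently depending on the order of summation.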






Mathematical structures and commutativity 

• A commutative semigroup is a set endowed with a total, associative and commutative operation. 

• If the operation additionally has an identity element, we have a commutative monoid. 

• An abelian group, or commutative group, is a group whose group operation is commutative.[2] 

• A commutative ring is a ring whose multiplication is commutative. (Addition in a ring is always 
commutative.)[12] 

• In a field both addition and multiplication are commutative.[13] 



Non-commuting operators in quantum mechanics 

In quantum mechanics as formulated by Schrödinger, physical variables are represented by linear operators such as x 
(meaning multiply by x), and d/dx. These two operators do not commute, as may be seen by considering the effect of 
their products x (d/dx) and (d/dx) x on a one-dimensional wave function ψ(x): 

$x \frac{d}{dx}\psi = x\psi' \neq \frac{d}{dx}(x\psi) = \psi + x\psi'$ 

According to the uncertainty principle of Heisenberg, if the two operators representing a pair of variables do not 
commute, then that pair of variables are mutually complementary, which means that they cannot be simultaneously 
measured or known precisely. For example, the position and the linear momentum of a particle are represented 
respectively (in the x-direction) by the operators x and (h/2πi) d/dx (where h is Planck's constant). This is the same 
example except for the constant (h/2πi), so again the operators do not commute and the physical meaning is that the 
position and linear momentum in a given direction are complementary. 
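
The failure of x and d/dx to commute can be checked on a concrete function. The sketch below is only an illustration under stated assumptions: it uses a polynomial as a stand-in for ψ (not a normalizable wave function), stores it as a hypothetical coefficient list [c0, c1, c2, ...], and the helper names are invented here.

```python
def derivative(coeffs):
    """d/dx of the polynomial c0 + c1*x + c2*x**2 + ... given as a list."""
    return [k * c for k, c in enumerate(coeffs)][1:] or [0]


def times_x(coeffs):
    """Multiply the polynomial by x (shift every coefficient up one power)."""
    return [0] + list(coeffs)


def subtract(p, q):
    """Coefficient-wise difference p - q, padding the shorter list with zeros."""
    n = max(len(p), len(q))
    p = list(p) + [0] * (n - len(p))
    q = list(q) + [0] * (n - len(q))
    return [a - b for a, b in zip(p, q)]


psi = [2, -1, 0, 3]                    # psi(x) = 2 - x + 3x^3

x_after_ddx = times_x(derivative(psi))   # x (d/dx) psi = x psi'
ddx_after_x = derivative(times_x(psi))   # (d/dx)(x psi) = psi + x psi'

difference = subtract(ddx_after_x, x_after_ddx)
```

The difference between the two orderings returns ψ itself, in agreement with (d/dx)(xψ) − x(d/dx)ψ = ψ.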



Notes 

[1] Axler, p. 2 
[2] Gallian, p. 34 
[3] p. 26, 87 
[4] Krowne, p. 1 
[5] Weisstein, Commute, p. 1 
[6] Lumpkin, p. 11 
[7] Gay and Shute, p.? 
[8] O'Connor and Robertson, Real Numbers 
[9] Cabillon and Miller, Commutative and Distributive 
[10] O'Connor and Robertson, Servois 
[11] Yark, p. 1 
[12] Gallian, p. 236 
[13] Gallian, p. 250 



References 
Books 

• Axler, Sheldon (1997). Linear Algebra Done Right, 2e. Springer. ISBN 0-387-98258-2. 

Abstract algebra theory. Covers commutativity in that context. Uses property throughout book. 

• Goodman, Frederick (2003). Algebra: Abstract and Concrete, Stressing Symmetry, 2e. Prentice Hall. 
ISBN 0-13-067342-0. 

Abstract algebra theory. Uses commutativity property throughout book. 

• Gallian, Joseph (2006). Contemporary Abstract Algebra, 6e. Boston, Mass.: Houghton Mifflin. 
ISBN 0-618-51471-6. 

Linear algebra theory. Explains commutativity in chapter 1, uses it throughout. 






Articles 

• Lumpkin, B. (1997). The Mathematical Legacy Of Ancient Egypt - A Response To Robert Palter 
(http://www.ethnomath.org/resources/lumpkin1997.pdf). Unpublished manuscript. 

Article describing the mathematical ability of ancient civilizations. 

• Robins, R. Gay, and Charles C. D. Shute. 1987. The Rhind Mathematical Papyrus: An Ancient Egyptian Text. 
London: British Museum Publications Limited. ISBN 0-7141-0944-4 

Translation and interpretation of the Rhind Mathematical Papyrus. 

Online resources 

• Krowne, Aaron, Commutative (http://planetmath.org/encyclopedia/Commutative.html) at PlanetMath., 
Accessed 8 August 2007. 

Definition of commutativity and examples of commutative operations 

• Weisstein, Eric W., " Commute (http://mathworld.wolfram.com/Commute.html)" from MathWorld., Accessed 
8 August 2007. 

Explanation of the term commute 

• Yark (http://planetmath.org/?op=getuser&id=2760). Examples of non-commutative operations 
(http://planetmath.org/encyclopedia/ExampleOfCommutative.html) at PlanetMath. Accessed 8 August 2007. 

Examples proving some noncommutative operations 

• O'Connor, J J and Robertson, E F. MacTutor history of real numbers 
(http://www-history.mcs.st-andrews.ac.uk/HistTopics/Real_numbers_1.html). Accessed 8 August 2007. 

Article giving the history of the real numbers 

• Cabillon, Julio and Miller, Jeff. Earliest Known Uses Of Mathematical Terms (http://jeff560.tripod.com/c.html). 
Accessed 22 November 2008. 

Page covering the earliest uses of mathematical terms 

• O'Connor, J J and Robertson, E F. MacTutor biography of François Servois 
(http://www-groups.dcs.st-and.ac.uk/~history/Biographies/Servois.html). Accessed 8 August 2007. 

Biography of François Servois, who first used the term 






Noncommutative quantum field theory 

In mathematical physics, noncommutative quantum field theory (or quantum field theory on noncommutative 
spacetime) is an application of noncommutative mathematics to the spacetime of quantum field theory that is an 
outgrowth of noncommutative geometry and index theory in which the coordinate functions[1] are noncommutative. 
One commonly studied version of such theories has the "canonical" commutation relation: 

$[x^{\mu}, x^{\nu}] = i \theta^{\mu\nu}$ 

with $\theta^{\mu\nu}$ a constant antisymmetric tensor, which means that (with any given set of axes), it is 
impossible to accurately measure the position of a particle with respect to more than one axis. In fact, this leads 
to an uncertainty relation for the coordinates analogous to the Heisenberg uncertainty principle. 
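
As a sketch of how such an uncertainty relation arises, assume the commonly used form of the canonical commutation relation, $[x^{\mu}, x^{\nu}] = i\theta^{\mu\nu}$ with a constant antisymmetric $\theta^{\mu\nu}$ (the symbol names are the conventional ones and are an assumption here). Applying the general Robertson inequality for two observables then gives a bound on the product of coordinate uncertainties:

```latex
% Robertson's inequality for two observables A and B in a given state:
\Delta A \, \Delta B \;\ge\; \tfrac{1}{2}\,\bigl|\langle [A, B] \rangle\bigr|

% Setting A = x^{\mu}, B = x^{\nu} and using [x^{\mu}, x^{\nu}] = i\theta^{\mu\nu}:
\Delta x^{\mu} \, \Delta x^{\nu} \;\ge\; \tfrac{1}{2}\,\bigl|\theta^{\mu\nu}\bigr|
```

This has the same structure as the position-momentum relation, with $\theta^{\mu\nu}$ playing the role that Planck's constant plays there.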

Various lower limits have been claimed for the noncommutative scale (i.e. how accurately positions can be 
measured), but there is currently no experimental evidence in favour of such theories or grounds for ruling them out. 

One of the novel features of noncommutative field theories is the UV/IR mixing phenomenon in which the physics 
at high energies affects the physics at low energies which does not occur in quantum field theories in which the 
coordinates commute. 

Other features include violation of Lorentz invariance due to the preferred direction of noncommutativity. 
Relativistic invariance can however be retained in the sense of twisted Poincaré invariance of the theory.[3] The 
causality condition is modified from that of the commutative theories. 

History and motivation 

Heisenberg was the first to suggest extending noncommutativity to the coordinates as a possible way of removing the 
infinite quantities appearing in field theories before the renormalization procedure was developed and had gained 
acceptance. The first paper on the subject was published in 1947 by Hartland Snyder. The success of the 
renormalization method resulted in little attention being paid to the subject for some time. In the 1980s, 
mathematicians, most notably Alain Connes, developed noncommutative geometry. Among other things, this work 
generalized the notion of differential structure to a noncommutative setting. This led to an operator algebraic 
description of noncommutative space-times, and the development of a Yang-Mills theory on a noncommutative 
torus. 

The particle physics community became interested in the noncommutative approach because of a paper by Nathan 
Seiberg and Edward Witten.[4] They argued in the context of string theory that the coordinate functions of the 
endpoints of open strings constrained to a D-brane in the presence of a constant Neveu-Schwarz B-field 
(equivalent to a constant magnetic field on the brane) would satisfy the noncommutative algebra set out above. The 
implication is that a quantum field theory on noncommutative spacetime can be interpreted as a low energy limit of 
the theory of open strings. 

A paper by Sergio Doplicher, Klaus Fredenhagen and John Roberts[5] set out another motivation for the possible 
noncommutativity of space-time. Their argument goes as follows: According to general relativity, when the energy 
density grows sufficiently large, a black hole is formed. On the other hand according to the Heisenberg uncertainty 
principle, a measurement of a space-time separation causes an uncertainty in momentum inversely proportional to 
the extent of the separation. Thus energy whose scale corresponds to the uncertainty in momentum is localized in the 
system within a region corresponding to the uncertainty in position. When the separation is small enough, the 
Schwarzschild radius of the system is reached and a black hole is formed, which prevents any information from 
escaping the system. Thus there is a lower bound for the measurement of length. A sufficient condition for 
preventing gravitational collapse can be expressed as an uncertainty relation for the coordinates. This relation can in 
turn be derived from a commutation relation for the coordinates. 






Footnotes 

[1] It is possible to have a noncommuting time coordinate, but this causes many problems such as the violation of unitarity of the S-matrix. 
Hence most research is restricted to so-called "space-space" noncommutativity. There have been attempts to avoid these problems by 
redefining the perturbation theory. However, string theory derivations of noncommutative coordinates exclude time-space noncommutativity. 

[2] See, for example, Shiraz Minwalla, Mark Van Raamsdonk, Nathan Seiberg (2000) "Noncommutative Perturbative Dynamics," 
(http://arxiv.org/abs/hep-th/9912072) Journal of High Energy Physics, and Alec Matusis, Leonard Susskind, Nicolaos Toumbas (2000) "The IR/UV 
Connection in the Non-Commutative Gauge Theories," (http://arxiv.org/abs/hep-th/0002075) Journal of High Energy Physics. 

[3] M. Chaichian, P. Presnajder, A. Tureanu (2005) "New concept of relativistic invariance in NC space-time: twisted Poincaré symmetry and its 
implications," (http://arxiv.org/abs/hep-th/0409096) Phys. Rev. Letters 94: . 

[4] Seiberg, N. and E. Witten (1999) "String Theory and Noncommutative Geometry," (http://arxiv.org/abs/hep-th/9908142) Journal of High 
Energy Physics. 

[5] Sergio Doplicher, Klaus Fredenhagen, John E. Roberts (1995) "The quantum structure of spacetime at the Planck scale and quantum fields," 
(http://arxiv.org/abs/hep-th/0303037) Commun. Math. Phys. 172: 187-220. 

Further reading 

• M.R. Douglas and N. A. Nekrasov (2001) "Noncommutative field theory," 
(http://prola.aps.org/abstract/RMP/v73/i4/p977_1?qid=a81527af6e5a2fa2&qseq=1&show=10) Rev. Mod. Phys. 73: 977-1029. 

• Szabo, R. J. (2003) "Quantum Field Theory on Noncommutative Spaces," (http://arxiv.org/abs/hep-th/0109162) 
Physics Reports 378: 207-99. An expository article on noncommutative quantum field theories. 

• Noncommutative quantum field theory, see statistics 
(http://xstructure.inr.ac.ru/x-bin/theme3.py?level=2&index1=-173391) on arxiv.org 

Noncommutative standard model 



In theoretical particle physics, the non-commutative Standard Model, mainly due to the French mathematician 
Alain Connes, uses his noncommutative geometry to devise an extension of the Standard Model to include a 
modified form of general relativity. This unification implies a few constraints on the parameters of the Standard 
Model. Under an additional assumption, known as the "big desert" hypothesis, one of these constraints determines 
the mass of the Higgs boson to be around 170 GeV, comfortably within the range of the Large Hadron Collider. 
Recent Tevatron experiments exclude a Higgs mass of 158 to 175 GeV at the 95% confidence level.[1] 

Background 

Current physical theory features four elementary forces: the gravitational force, the electromagnetic force, the weak 
force, and the strong force. Gravity has an elegant and experimentally precise theory: Einstein's general relativity. It 
is based on Riemannian geometry and interprets the gravitational force as curvature of space-time. Its Lagrangian 
formulation requires only two empirical parameters, the gravitational constant and the cosmological constant. 

The other three forces also have a Lagrangian theory, called the Standard Model. Its underlying idea is that they are 
mediated by the exchange of spin-1 particles, the so-called gauge bosons. The one responsible for electromagnetism 
is the photon. The weak force is mediated by the W and Z bosons; the strong force, by gluons. The gauge Lagrangian 
is much more complicated than the gravitational one: at present, it involves some 30 real parameters, a number that 
could increase. What is more, the gauge Lagrangian must also contain a spin 0 particle, the Higgs boson, to give 
mass to the spin 1/2 and spin 1 particles. This particle has yet to be observed, and if it is not detected at the Large 
Hadron Collider in Geneva, the consistency of the Standard Model is in doubt. 

Alain Connes has generalized Bernhard Riemann's geometry to noncommutative geometry. It describes spaces with 
curvature and uncertainty. Historically, the first example of such a geometry is quantum mechanics, which 
introduced Heisenberg's uncertainty relation by turning the classical observables of position and momentum into 
noncommuting operators. Noncommutative geometry is still sufficiently similar to Riemannian geometry that 






Connes was able to rederive general relativity. In doing so, he obtained the gauge Lagrangian as a companion of the 
gravitational one, a truly geometric unification of all four fundamental interactions. Connes has thus devised a fully 
geometric formulation of the Standard Model, where all the parameters are geometric invariants of a 
noncommutative space. A result is that parameters like the electron mass are now analogous to purely mathematical 
constants like pi. 

Notes 

[1] The TEVNPH Working Group (http://arxiv.org/abs/1007.4587) 

References 

• Alain Connes (1994) Noncommutative geometry (http://www.alainconnes.org/docs/book94bigpdf.pdf). 
Academic Press. ISBN 0-12-185860-X. 

• Alain Connes (1995) "Noncommutative geometry and reality," J. Math. Phys. 36: 6194. 

• Alain Connes (1996) "Gravity coupled with matter and the foundation of noncommutative geometry," 
(http://arxiv.org/abs/hep-th/9603053) Comm. Math. Phys. 155: 109. 

• Alain Connes (2006) "Noncommutative geometry and physics" (http://www.alainconnes.org/docs/einsymp.pdf). 

• Alain Connes and M. Marcolli, Noncommutative Geometry: Quantum Fields and Motives 
(http://www.alainconnes.org/en/downloads.php). American Mathematical Society (2007). 

• Chamseddine, A., A. Connes (1996) " The spectral action principle, (http://arxiv.org/abs/hep-th/9606001)" 
Comm. Math. Phys. 182: 155. 

• Chamseddine, A., A. Connes, M. Marcolli (2007) "Gravity and the Standard Model with neutrino mixing," 
(http://arxiv.org/abs/hep-th/0610241) Adv. Theor. Math. Phys. 11: 991. 

• Jureit, Jan-H., Thomas Krajewski, Thomas Schücker, and Christoph A. Stephan (2007) "On the noncommutative 
standard model," (http://arxiv.org/abs/0705.0489) Acta Phys. Polon. B38: 3181-3202. 

• Schücker, Thomas (2005) Forces from Connes's geometry (http://arxiv.org/abs/hep-th/0111236). Lecture 
Notes in Physics 659, Springer. 

External links 

• Alain Connes official website (http://www.alainconnes.org/) with downloadable papers 
(http://www.alainconnes.org/en/downloads.php). 

• Alain Connes's Standard Model (http://resonaances.blogspot.com/2007/02/alain-connes-standard-model.html). 






Nonabelian Gauge Theory 

In physics, a gauge theory is a type of field theory in which the Lagrangian is invariant under a continuous group of 
local transformations. 

The term gauge refers to redundant degrees of freedom in the Lagrangian. The transformations between possible 
gauges, called gauge transformations, form a Lie group which is referred to as the symmetry group or the gauge 
group of the theory. Associated with any Lie group is the Lie algebra of group generators. For each group generator 
there necessarily arises a corresponding vector field called the gauge field. Gauge fields are included in the 
Lagrangian to ensure its invariance under the local group transformations (called gauge invariance). When such a 
theory is quantized, the quanta of the gauge fields are called gauge bosons. If the symmetry group is 
non-commutative, the gauge theory is referred to as non-abelian, the usual example being the Yang-Mills theory. 

Gauge theories are important as the successful field theories explaining the dynamics of elementary particles. 
Quantum electrodynamics is an abelian gauge theory with the symmetry group U(1) and has one gauge field, the 
electromagnetic field, with the photon being the gauge boson. The Standard Model is a non-abelian gauge theory 
with the symmetry group U(1) × SU(2) × SU(3) and has a total of twelve gauge bosons: the photon, three weak bosons 
and eight gluons. 
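
The count of twelve can be cross-checked against the statement that each group generator comes with its own gauge field. The sketch below assumes the standard dimension formulas for these groups (n² generators for U(n) and n² − 1 for SU(n)); the function names are hypothetical.

```python
def generators_of_u(n):
    """Number of generators (dimension) of the unitary group U(n)."""
    return n * n


def generators_of_su(n):
    """Number of generators (dimension) of the special unitary group SU(n)."""
    return n * n - 1


# One gauge field, and hence one gauge boson, per generator of each factor.
u1_fields = generators_of_u(1)
su2_fields = generators_of_su(2)
su3_fields = generators_of_su(3)
total_gauge_bosons = u1_fields + su2_fields + su3_fields
```

The three terms are the generator counts of U(1), SU(2) and SU(3), and their sum reproduces the total quoted above.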

Many powerful theories in physics are described by Lagrangians which are invariant under some symmetry 
transformation groups. When they are invariant under a transformation identically performed at every point in the 
space in which the physical processes occur, they are said to have a global symmetry. The requirement of local 
symmetry, the cornerstone of gauge theories, is a stricter constraint. In fact, a global symmetry is just a local 
symmetry whose group's parameters are fixed in space-time. Gauge symmetries can be viewed as analogues of the 
equivalence principle of general relativity in which each point in spacetime is allowed a choice of local reference 
(coordinate) frame. Both symmetries reflect a redundancy in the description of a system. 

Historically, these ideas were first stated in the context of classical electromagnetism and later in general relativity. 
However, the modern importance of gauge symmetries appeared first in the relativistic quantum mechanics of 
electrons — quantum electrodynamics, elaborated on below. Today, gauge theories are useful in condensed matter, 
nuclear and high energy physics among other subfields. 

History and importance 

The earliest field theory having a gauge symmetry was Maxwell's formulation of electrodynamics in 1864. The 
importance of this symmetry remained unnoticed in the earliest formulations. Similarly unnoticed, Hilbert had 
derived the Einstein field equations by postulating the invariance of the action under a general coordinate 
transformation. Later Hermann Weyl, in an attempt to unify general relativity and electromagnetism, conjectured 
(incorrectly, as it turned out) that Eichinvarianz or invariance under the change of scale (or "gauge") might also be a 
local symmetry of general relativity. After the development of quantum mechanics, Weyl, Vladimir Fock and Fritz 
London modified gauge by replacing the scale factor with a complex quantity and turned the scale transformation 
into a change of phase, a U(1) gauge symmetry. This explained the electromagnetic field effect on the wave 
function of a charged quantum mechanical particle. This was the first widely recognised gauge theory, popularised 
by Pauli in the 1940s. [1] 

In 1954, attempting to resolve some of the great confusion in elementary particle physics, Chen Ning Yang and 
Robert Mills introduced non-abelian gauge theories as models to understand the strong interaction holding together 
nucleons in atomic nuclei. (Ronald Shaw, working under Abdus Salam, independently introduced the same notion in 
his doctoral thesis.) Generalizing the gauge invariance of electromagnetism, they attempted to construct a theory 
based on the action of the (non-abelian) SU(2) symmetry group on the isospin doublet of protons and neutrons. This 
is similar to the action of the U(1) group on the spinor fields of quantum electrodynamics. In particle physics the 






emphasis was on using quantized gauge theories. 

This idea later found application in the quantum field theory of the weak force, and its unification with 
electromagnetism in the electroweak theory. Gauge theories became even more attractive when it was realized that 
non-abelian gauge theories reproduced a feature called asymptotic freedom. Asymptotic freedom was believed to be 
an important characteristic of strong interactions. This motivated searching for a strong force gauge theory. This 
theory, now known as quantum chromodynamics, is a gauge theory with the action of the SU(3) group on the color 
triplet of quarks. The Standard Model unifies the description of electromagnetism, weak interactions and strong 
interactions in the language of gauge theory. 

In the 1970s, Sir Michael Atiyah began studying the mathematics of solutions to the classical Yang-Mills equations. 
In 1983, Atiyah's student Simon Donaldson built on this work to show that the differentiable classification of smooth 
4-manifolds is very different from their classification up to homeomorphism. Michael Freedman used Donaldson's 
work to exhibit exotic $\mathbb{R}^4$s, that is, exotic differentiable structures on Euclidean 4-dimensional space. This led to an 
increasing interest in gauge theory for its own sake, independent of its successes in fundamental physics. In 1994, 
Edward Witten and Nathan Seiberg invented gauge-theoretic techniques based on supersymmetry which enabled the 
calculation of certain topological invariants. These contributions to mathematics from gauge theory have led to a 
renewed interest in this area. 

The importance of gauge theories for physics stems from the tremendous success of the mathematical formalism in 
providing a unified framework to describe the quantum field theories of electromagnetism, the weak force and the 
strong force. This theory, known as the Standard Model, accurately describes experimental predictions regarding 
three of the four fundamental forces of nature, and is a gauge theory with the gauge group SU(3) × SU(2) × U(1). 
Modern theories like string theory, as well as some formulations of general relativity, are, in one way or another, 
gauge theories. 

See Pickering for more about the history of gauge and quantum field theories. 

Description 

Global and local symmetries 

In physics, the mathematical description of any physical situation usually contains excess degrees of freedom; the 
same physical situation is equally well described by many equivalent mathematical configurations. For instance, in 
Newtonian dynamics, if two configurations are related by a Galilean transformation — an inertial change of reference 
frame — they represent the same physical situation. These transformations form a group of "symmetries" of the 
theory, and a physical situation corresponds not to an individual mathematical configuration but to a class of 
configurations related to one another by this symmetry group. This idea can be generalized to include local as well as 
global symmetries, analogous to much more abstract "changes of coordinates" in a situation where there is no 
preferred "inertial" coordinate system that covers the entire physical system. A gauge theory is a mathematical 
model that has symmetries of this kind, together with a set of techniques for making physical predictions consistent 
with the symmetries of the model. 

Example of global symmetry 

When a quantity occurring in the mathematical configuration is not just a number but has some geometrical 
significance, such as a velocity or an axis of rotation, its representation as numbers arranged in a vector or matrix is 
also changed by a coordinate transformation. For instance, if one description of a pattern of fluid flow states that the 
fluid velocity in the neighborhood of (x=1, y=0) is 1 m/s in the positive x direction, then a description of the same 
situation in which the coordinate system has been rotated clockwise by 90 degrees will state that the fluid velocity in 
the neighborhood of (x=0, y=1) is 1 m/s in the positive y direction. The coordinate transformation has affected both 
the coordinate system used to identify the location of the measurement and the basis in which its value is expressed. 






As long as this transformation is performed globally (affecting the coordinate basis in the same way at every point), 
the effect on values that represent the rate of change of some quantity along some path in space and time as it passes 
through point P is the same as the effect on values that are truly local to P. 
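
The fluid-flow example can be worked through numerically. The following is a minimal sketch, assuming a two-dimensional frame and treating the change of coordinates as a passive rotation of the axes; the function name components_in_rotated_frame is hypothetical.

```python
import math


def components_in_rotated_frame(vector, axes_angle_deg):
    """Components of a fixed 2D vector in a frame whose axes are rotated
    counterclockwise by axes_angle_deg (a negative angle means clockwise)."""
    phi = math.radians(axes_angle_deg)
    x, y = vector
    return (x * math.cos(phi) + y * math.sin(phi),
            -x * math.sin(phi) + y * math.cos(phi))


position = (1.0, 0.0)   # measurement location in the original frame
velocity = (1.0, 0.0)   # 1 m/s in the positive x direction

# Rotate the coordinate axes clockwise by 90 degrees.
new_position = components_in_rotated_frame(position, -90)
new_velocity = components_in_rotated_frame(velocity, -90)

# Round away floating-point noise of order 1e-16.
new_position = tuple(round(c, 9) for c in new_position)
new_velocity = tuple(round(c, 9) for c in new_velocity)
```

Both the location label and the velocity components are transformed by the same rotation, which is what makes this a global transformation: the same change of basis is applied at every point.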

Use of fiber bundles to describe local symmetries 

In order to adequately describe physical situations in more complex theories, it is often necessary to introduce a 
"coordinate basis" for some of the objects of the theory that do not have this simple relationship to the coordinates 
used to label points in space and time. (In mathematical terms, the theory involves a fiber bundle in which the fiber 
at each point of the base space consists of possible coordinate bases for use when describing the values of objects at 
that point.) In order to spell out a mathematical configuration, one must choose a particular coordinate basis at each 
point (a local section of the fiber bundle) and express the values of the objects of the theory (usually "fields" in the 
physicist's sense) using this basis. Two such mathematical configurations are equivalent (describe the same physical 
situation) if they are related by a transformation of this abstract coordinate basis (a change of local section, or gauge 
transformation). 

In most gauge theories, the set of possible transformations of the abstract gauge basis at an individual point in space 
and time is a finite-dimensional Lie group. The simplest such group is U(1), which appears in the modern 
formulation of quantum electrodynamics (QED) via its use of complex numbers. QED is generally regarded as the 
first, and simplest, physical gauge theory. The set of possible gauge transformations of the entire configuration of a 
given gauge theory also forms a group, the gauge group of the theory. An element of the gauge group can be 
parameterized by a smoothly varying function from the points of spacetime to the (finite-dimensional) Lie group, 
whose value at each point represents the action of the gauge transformation on the fiber over that point. 

A gauge transformation with constant parameter at every point in space and time is analogous to a rigid rotation of 
the geometric coordinate system; it represents a global symmetry of the gauge representation. As in the case of a 
rigid rotation, this gauge transformation affects expressions that represent the rate of change along a path of some 
gauge-dependent quantity in the same way as those that represent a truly local quantity. A gauge transformation 
whose parameter is not a constant function is referred to as a local symmetry; its effect on expressions that involve a 
derivative is qualitatively different from that on expressions that don't. (This is analogous to a non-inertial change of 
reference frame, which can produce a Coriolis effect.) 
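
The difference between a constant and a position-dependent parameter can be checked numerically. The following is a minimal sketch, assuming a single coordinate x, a U(1) phase as the gauge parameter, and an arbitrary sample field; the names psi, mismatch, global_theta and local_theta are illustrative and do not come from the text.

```python
import cmath


def derivative(f, x, h=1e-6):
    """Central finite-difference approximation of df/dx."""
    return (f(x + h) - f(x - h)) / (2 * h)


def psi(x):
    """Arbitrary sample field with unit modulus (an illustrative choice)."""
    return cmath.exp(1j * 2.0 * x)


def mismatch(theta, x):
    """Size of d/dx(e^{i theta} psi) - e^{i theta} d/dx(psi) at the point x."""
    transformed = lambda y: cmath.exp(1j * theta(y)) * psi(y)
    lhs = derivative(transformed, x)
    rhs = cmath.exp(1j * theta(x)) * derivative(psi, x)
    return abs(lhs - rhs)


global_theta = lambda x: 0.7          # constant parameter: global symmetry
local_theta = lambda x: 0.7 * x * x   # varying parameter: local symmetry

x0 = 1.3
global_mismatch = mismatch(global_theta, x0)
local_mismatch = mismatch(local_theta, x0)
# By the product rule the extra term is i * theta'(x) * e^{i theta} * psi,
# whose modulus is |theta'(x)| because |psi| = 1 here.
expected_extra = abs(derivative(local_theta, x0))
```

In this sketch the constant phase passes through the derivative, while the varying phase leaves a remainder proportional to the derivative of the parameter. That remainder is the term a gauge field is later introduced to absorb.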

Gauge fields 

The "gauge covariant" version of a gauge theory accounts for this effect by introducing a gauge field (in 
mathematical language, an Ehresmann connection) and formulating all rates of change in terms of the covariant 
derivative with respect to this connection. The gauge field becomes an essential part of the description of a 
mathematical configuration. A configuration in which the gauge field can be eliminated by a gauge transformation 
has the property that its field strength (in mathematical language, its curvature) is zero everywhere; a gauge theory is 
not limited to these configurations. In other words, the distinguishing characteristic of a gauge theory is that the 
gauge field does not merely compensate for a poor choice of coordinate system; there is generally no gauge 
transformation that makes the gauge field vanish. 

When analyzing the dynamics of a gauge theory, the gauge field must be treated as a dynamical variable, similarly to 
other objects in the description of a physical situation. In addition to its interaction with other objects via the 
covariant derivative, the gauge field typically contributes energy in the form of a "self-energy" term. One can obtain 
the equations for the gauge theory by: 

• starting from a naive ansatz without the gauge field (in which the derivatives appear in a "bare" form); 

• listing those global symmetries of the theory that can be characterized by a continuous parameter (generally an 
abstract equivalent of a rotation angle); 

• computing the correction terms that result from allowing the symmetry parameter to vary from place to place; and 



• reinterpreting these correction terms as couplings to one or more gauge fields, and giving these fields appropriate 
self-energy terms and dynamical behavior. 

This is the sense in which a gauge theory "extends" a global symmetry to a local symmetry; the procedure closely
resembles the historical development of the gauge theory of gravity known as general relativity.

Physical experiments 

Gauge theories are used to model the results of physical experiments, essentially by: 

• limiting the universe of possible configurations to those consistent with the information used to set up the 
experiment, and then 

• computing the probability distribution of the possible outcomes that the experiment is designed to measure. 

The mathematical descriptions of the "setup information" and the "possible measurement outcomes" (loosely 
speaking, the "boundary conditions" of the experiment) are generally not expressible without reference to a particular 
coordinate system, including a choice of gauge. (If nothing else, one assumes that the experiment has been 
adequately isolated from "external" influence, which is itself a gauge-dependent statement.) Mishandling gauge 
dependence in boundary conditions is a frequent source of anomalies in gauge theory calculations, and gauge 
theories can be broadly classified by their approaches to anomaly avoidance. 

Continuum theories 

The two gauge theories mentioned above (continuum electrodynamics and general relativity) are examples of 
continuum field theories. The techniques of calculation in a continuum theory implicitly assume that: 

• given a completely fixed choice of gauge, the boundary conditions of an individual configuration can in principle 
be completely described; 

• given a completely fixed gauge and a complete set of boundary conditions, the principle of least action determines 
a unique mathematical configuration (and therefore a unique physical situation) consistent with these bounds; 

• the likelihood of possible measurement outcomes can be determined by: 

• establishing a probability distribution over all physical situations determined by boundary conditions that are 
consistent with the setup information, 

• establishing a probability distribution of measurement outcomes for each possible physical situation, and 

• convolving these two probability distributions to get a distribution of possible measurement outcomes 
consistent with the setup information; and 

• fixing the gauge introduces no anomalies in the calculation, due either to gauge dependence in describing partial 
information about boundary conditions or to incompleteness of the theory. 
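
The third assumption above can be sketched for a discrete toy case. This is only an illustration, reading "convolving" as summing over physical situations weighted by their probability (an interpretive assumption); the situation labels, outcome labels and numbers are hypothetical.

```python
def outcome_distribution(prior, likelihood):
    """P(outcome) = sum over situations of P(situation) * P(outcome | situation)."""
    result = {}
    for situation, p_situation in prior.items():
        for outcome, p_outcome in likelihood[situation].items():
            result[outcome] = result.get(outcome, 0.0) + p_situation * p_outcome
    return result


# Distribution over physical situations consistent with the setup information.
prior = {"s1": 0.25, "s2": 0.75}

# Distribution of measurement outcomes for each physical situation.
likelihood = {
    "s1": {"click": 0.5, "no click": 0.5},
    "s2": {"click": 0.0, "no click": 1.0},
}

predicted = outcome_distribution(prior, likelihood)
```

The result is again a normalized distribution over measurement outcomes, which is what the experiment is compared against.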

These assumptions are close enough to valid, across a wide range of energy scales and experimental conditions, to 
allow these theories to make accurate predictions about almost all of the phenomena encountered in daily life, from 
light, heat, and electricity to eclipses and spaceflight. They fail only at the smallest and largest scales (due to 
omissions in the theories themselves) and when the mathematical techniques themselves break down (most notably 
in the case of turbulence and other chaotic phenomena). 

Quantum field theories 

Other than these "classical" continuum field theories, the most widely known gauge theories are quantum field 
theories, including quantum electrodynamics and the Standard Model of elementary particle physics. The starting 
point of a quantum field theory is much like that of its continuum analog: a gauge-covariant action integral which
characterizes "allowable" physical situations according to the principle of least action. However, continuum and 
quantum theories differ significantly in how they handle the excess degrees of freedom represented by gauge 
transformations. Continuum theories, and most pedagogical treatments of the simplest quantum field theories, use a
gauge fixing prescription to reduce the orbit of mathematical configurations that represent a given physical situation
to a smaller orbit related by a smaller gauge group (the global symmetry group, or perhaps even the trivial group). 

More sophisticated quantum field theories, in particular those which involve a non-abelian gauge group, break the 
gauge symmetry within the techniques of perturbation theory by introducing additional fields (the Faddeev-Popov 
ghosts) and counterterms motivated by anomaly cancellation, in an approach known as BRST quantization. While 
these concerns are in one sense highly technical, they are also closely related to the nature of measurement, the limits 
on knowledge of a physical situation, and the interactions between incompletely specified experimental conditions 
and incompletely understood physical theory. The mathematical techniques that have been developed in order to 
make gauge theories tractable have found many other applications, from solid-state physics and crystallography to 
low-dimensional topology. 

Classical gauge theory 
Classical electromagnetism 

Historically, the first example of gauge symmetry to be discovered was classical electromagnetism. In static
electricity, one can either discuss the electric field, E, or its corresponding electric potential, V. Knowledge of one
makes it possible to find the other, except that potentials differing by a constant, $V \to V + C$, correspond to the
same electric field. This is because the electric field relates to changes in the potential from one point in space to
another, and the constant C would cancel out when subtracting to find the change in potential. In terms of vector
calculus, the electric field is the gradient of the potential, $\mathbf{E} = -\nabla V$. Generalizing from static electricity to
electromagnetism, we have a second potential, the vector potential A, with

$$\mathbf{E} = -\nabla V - \frac{\partial \mathbf{A}}{\partial t}, \qquad \mathbf{B} = \nabla \times \mathbf{A}.$$

The general gauge transformations now become not just $V \to V + C$ but

$$\mathbf{A} \to \mathbf{A} + \nabla f, \qquad V \to V - \frac{\partial f}{\partial t},$$

where f is any function that depends on position and time. The fields remain the same under the gauge
transformation, and therefore Maxwell's equations are still satisfied. That is, Maxwell's equations have a gauge
symmetry.

An example: Scalar O(n) gauge theory

The remainder of this section requires some familiarity with classical or quantum field theory, and the use of
Lagrangians.

Definitions in this section: gauge group, gauge field, interaction Lagrangian, gauge boson.

The following illustrates how local gauge invariance can be "motivated" heuristically starting from global symmetry
properties, and how it leads to an interaction between fields which were originally non-interacting.

Consider a set of n non-interacting scalar fields, with equal masses m. This system is described by an action which is
the sum of the (usual) action for each scalar field $\varphi_i$:

$$S = \int d^4x \sum_{i=1}^{n} \left[ \frac{1}{2} \partial_\mu \varphi_i \, \partial^\mu \varphi_i - \frac{1}{2} m^2 \varphi_i^2 \right].$$

The Lagrangian (density) can be compactly written as

$$\mathcal{L} = \frac{1}{2} (\partial_\mu \Phi)^T \partial^\mu \Phi - \frac{1}{2} m^2 \Phi^T \Phi$$



by introducing a vector of fields

$$\Phi = (\varphi_1, \varphi_2, \ldots, \varphi_n)^T.$$

The term $\partial_\mu \Phi$ is Einstein notation for the partial derivative of $\Phi$ in each of the four dimensions. It is now transparent
that the Lagrangian is invariant under the transformation

$$\Phi \mapsto \Phi' = G \Phi$$

whenever G is a constant matrix belonging to the n-by-n orthogonal group O(n). This is seen to preserve the
Lagrangian since the derivative of $\Phi$ will transform identically to $\Phi$ and both quantities appear inside dot products
in the Lagrangian (orthogonal transformations preserve the dot product).
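
The invariance argument can be checked numerically at a single spacetime point. The sketch below assumes n = 3, a metric signature (+,-,-,-), and arbitrary sample values for the fields and their derivatives; all names and numbers are illustrative.

```python
import math

M_SQUARED = 1.5                    # illustrative value of m^2
METRIC = [1.0, -1.0, -1.0, -1.0]   # assumed signature (+,-,-,-)


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def lagrangian(phi, dphi):
    """phi holds the n field values; dphi[mu] holds the n values of d_mu phi."""
    kinetic = sum(METRIC[mu] * dot(dphi[mu], dphi[mu]) for mu in range(4))
    return 0.5 * kinetic - 0.5 * M_SQUARED * dot(phi, phi)


def transform(v, angle=0.9):
    """Rotate in the (1,2) plane and flip the third component: an element of O(3)."""
    c, s = math.cos(angle), math.sin(angle)
    return [c * v[0] - s * v[1], s * v[0] + c * v[1], -v[2]]


def stretch(v):
    """A non-orthogonal map, used only for contrast."""
    return [2.0 * a for a in v]


phi = [0.3, -1.2, 0.7]
dphi = [[1.0, 0.5, -0.2], [0.1, 0.0, 0.3], [0.0, -0.4, 0.2], [0.6, 0.1, 0.0]]

before = lagrangian(phi, dphi)
after = lagrangian(transform(phi), [transform(d) for d in dphi])
stretched = lagrangian(stretch(phi), [stretch(d) for d in dphi])
```

The same constant matrix is applied to the field and to each of its derivatives, which is exactly what fails once G depends on x.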

This characterizes the global symmetry of this particular Lagrangian, and the symmetry group is often called the 
gauge group; the mathematical term is structure group, especially in the theory of G-structures. Incidentally, 
Noether's theorem implies that invariance under this group of transformations leads to the conservation of the
current

$$J^a_\mu = i \, \partial_\mu \Phi^T T^a \Phi,$$

where the $T^a$ matrices are generators of the SO(n) group. There is one conserved current for every generator.

Now, demanding that this Lagrangian should have local O(n)-invariance requires that the G matrices (which were
earlier constant) should be allowed to become functions of the space-time coordinates x.

Unfortunately, the G matrices do not "pass through" the derivatives: when G = G(x),

$$\partial_\mu (G \Phi) \neq G \, (\partial_\mu \Phi).$$

The failure of the derivative to commute with G introduces an additional term (in keeping with the product rule)
which spoils the invariance of the Lagrangian. In order to rectify this we define a new derivative operator such that
the derivative of $\Phi$ will again transform identically with $\Phi$:

$$(D_\mu \Phi)' = G \, D_\mu \Phi.$$

This new "derivative" is called a covariant derivative and takes the form

$$D_\mu = \partial_\mu + g A_\mu,$$

where g is called the coupling constant, a quantity defining the strength of an interaction. After a simple
calculation we can see that the gauge field A(x) must transform as follows:

$$A'_\mu = G A_\mu G^{-1} - \frac{1}{g} (\partial_\mu G) G^{-1}.$$
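
The "simple calculation" can be spelled out as follows. This is a sketch, assuming the transformed covariant derivative is written $D'_\mu = \partial_\mu + g A'_\mu$ (the primed notation is introduced here for illustration) and that the covariance condition must hold for every $\Phi$.

```latex
\begin{align*}
D'_\mu (G\Phi) &= (\partial_\mu G)\Phi + G\,\partial_\mu \Phi + g A'_\mu G \Phi
  && \text{(product rule)} \\
G\, D_\mu \Phi &= G\,\partial_\mu \Phi + g\, G A_\mu \Phi
  && \text{(required result)} \\
\Rightarrow\quad g A'_\mu G &= g\, G A_\mu - \partial_\mu G
  && \text{(equate for all } \Phi\text{)} \\
\Rightarrow\quad A'_\mu &= G A_\mu G^{-1} - \frac{1}{g} (\partial_\mu G)\, G^{-1}
  && \text{(multiply by } G^{-1}/g \text{ on the right)}
\end{align*}
```

For constant G the second term vanishes and the gauge field simply rotates with the fields.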

The gauge field is an element of the Lie algebra, and can therefore be expanded as

$$A_\mu = \sum_a A^a_\mu T^a.$$

There are therefore as many gauge fields as there are generators of the Lie algebra.

Finally, we now have a locally gauge invariant Lagrangian

$$\mathcal{L}_{\mathrm{loc}} = \frac{1}{2} (D_\mu \Phi)^T D^\mu \Phi - \frac{1}{2} m^2 \Phi^T \Phi.$$

Pauli uses the term gauge transformation of the first type for the one applied to fields such as $\Phi$, while the compensating
transformation in A is said to be a gauge transformation of the second type.



The difference between this Lagrangian and the original globally gauge-invariant Lagrangian is seen to be the
interaction Lagrangian

$$\mathcal{L}_{\mathrm{int}} = \frac{g}{2} \Phi^T A_\mu^T \partial^\mu \Phi + \frac{g}{2} (\partial_\mu \Phi)^T A^\mu \Phi + \frac{g^2}{2} (A_\mu \Phi)^T A^\mu \Phi.$$

Figure: Feynman diagram of scalar bosons interacting via a gauge boson.

This term introduces interactions between the n scalar fields just as a consequence of the demand for local gauge
invariance. However, to make this interaction physical and not completely arbitrary, the mediator A(x) needs to
propagate in space. That is dealt with in the next section by adding yet another term, $\mathcal{L}_{\mathrm{gf}}$, to the Lagrangian. In the
quantized version of the obtained classical field theory, the quanta of the gauge field A(x) are called gauge bosons.
The interpretation of the interaction Lagrangian in quantum field theory is of scalar bosons interacting by the
exchange of these gauge bosons.
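
The interaction terms follow from expanding the covariant derivatives in the locally invariant Lagrangian. A short worked expansion, assuming $D_\mu = \partial_\mu + g A_\mu$ as above and writing $\mathcal{L}_{\mathrm{global}}$ for the original Lagrangian:

```latex
\begin{align*}
\mathcal{L}_{\mathrm{loc}}
 &= \tfrac{1}{2} (\partial_\mu \Phi + g A_\mu \Phi)^T (\partial^\mu \Phi + g A^\mu \Phi)
    - \tfrac{1}{2} m^2 \Phi^T \Phi \\
 &= \underbrace{\tfrac{1}{2} (\partial_\mu \Phi)^T \partial^\mu \Phi
    - \tfrac{1}{2} m^2 \Phi^T \Phi}_{\mathcal{L}_{\mathrm{global}}}
    + \tfrac{g}{2} (A_\mu \Phi)^T \partial^\mu \Phi
    + \tfrac{g}{2} (\partial_\mu \Phi)^T A^\mu \Phi
    + \tfrac{g^2}{2} (A_\mu \Phi)^T A^\mu \Phi ,
\end{align*}
```

Using $(A_\mu \Phi)^T = \Phi^T A_\mu^T$ in the first cross term gives the three terms of the interaction Lagrangian: two linear in the gauge field and one quadratic.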

The Yang-Mills Lagrangian for the gauge field 

The picture of a classical gauge theory developed in the previous section is almost complete, except for the fact that
to define the covariant derivatives D, one needs to know the value of the gauge field A(x) at all space-time points.
Instead of manually specifying the values of this field, it can be given as the solution to a field equation. Further
requiring that the Lagrangian which generates this field equation is locally gauge invariant as well, one possible form
for the gauge field Lagrangian is (conventionally) written as

$$\mathcal{L}_{\mathrm{gf}} = -\frac{1}{2} \operatorname{Tr}(F^{\mu\nu} F_{\mu\nu})$$

with

$$F_{\mu\nu} = \frac{1}{ig} [D_\mu, D_\nu]$$

and the trace being taken over the vector space of the fields. This is called the Yang-Mills action. Other gauge
invariant actions also exist (e.g. nonlinear electrodynamics, Born-Infeld action, Chern-Simons model, theta term
etc.).
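
The commutator in the definition of the field strength can be evaluated explicitly. The following worked step assumes $D_\mu = \partial_\mu + g A_\mu$ from the previous section and acts on an arbitrary field $\Phi$.

```latex
\begin{align*}
[D_\mu, D_\nu]\,\Phi
 &= (\partial_\mu + g A_\mu)(\partial_\nu \Phi + g A_\nu \Phi)
  - (\partial_\nu + g A_\nu)(\partial_\mu \Phi + g A_\mu \Phi) \\
 &= g\,(\partial_\mu A_\nu - \partial_\nu A_\mu)\,\Phi + g^2\,[A_\mu, A_\nu]\,\Phi , \\
F_{\mu\nu} = \frac{1}{ig}[D_\mu, D_\nu]
 &= -i\,\bigl(\partial_\mu A_\nu - \partial_\nu A_\mu + g\,[A_\mu, A_\nu]\bigr).
\end{align*}
```

All terms containing derivatives of $\Phi$ cancel, so the commutator is a multiplication operator rather than a differential one. For an abelian group the term $[A_\mu, A_\nu]$ vanishes; for a non-abelian group it makes the gauge field interact with itself.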

Note that in this Lagrangian term there is no field whose transformation counterweighs the one of $A$. Invariance of
this term under gauge transformations is a particular case of a priori classical (geometrical) symmetry. This 
symmetry must be restricted in order to perform quantization, the procedure being denominated gauge fixing, but 
even after restriction, gauge transformations may be possible. 1 

The complete Lagrangian for the gauge theory is now 

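$$\mathcal{L} = \mathcal{L}_{\mathrm{loc}} + \mathcal{L}_{\mathrm{gf}} = \mathcal{L}_{\mathrm{global}} + \mathcal{L}_{\mathrm{int}} + \mathcal{L}_{\mathrm{gf}}$$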



An example: Electrodynamics 

As a simple application of the formalism developed in the previous sections, consider the case of electrodynamics, 
with only the electron field. The bare-bones action which generates the electron field's Dirac equation is

$$S = \int \bar\psi \left( i \hbar c \, \gamma^\mu \partial_\mu - m c^2 \right) \psi \, d^4x.$$

The global symmetry for this system is

$$\psi \mapsto e^{i\theta} \psi.$$

The gauge group here is U(1), just the phase angle of the field, with a constant $\theta$.
"Localising" this symmetry implies the replacement of $\theta$ by $\theta(x)$.
An appropriate covariant derivative is then

$$D_\mu = \partial_\mu - i \frac{e}{\hbar} A_\mu.$$
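
That this derivative transforms like the field itself can be checked numerically. The sketch below is restricted to one coordinate and assumes the compensating shift $A \to A + (\hbar/e)\,\partial\theta$, which is the shift that makes the check pass for this sign convention; Q stands in for e/ħ, and the sample functions psi, A and theta are arbitrary illustrative choices.

```python
import cmath
import math

Q = 0.5  # stands in for e/hbar; an arbitrary illustrative value


def d(f, x, h=1e-6):
    """Central finite-difference approximation of df/dx."""
    return (f(x + h) - f(x - h)) / (2 * h)


def psi(x):
    return cmath.exp(1j * 3.0 * x) * (1.0 + 0.2 * x)


def A(x):
    return math.sin(x)


def theta(x):
    return 0.4 * x * x


def covariant_derivative(field, potential, x):
    """D psi = d psi/dx - i Q A psi, in one dimension."""
    return d(field, x) - 1j * Q * potential(x) * field(x)


def psi_prime(x):
    """Locally phase-rotated field."""
    return cmath.exp(1j * theta(x)) * psi(x)


def A_prime(x):
    """Gauge field shifted by the gradient of the phase."""
    return A(x) + d(theta, x) / Q


x0 = 0.8
lhs = covariant_derivative(psi_prime, A_prime, x0)
rhs = cmath.exp(1j * theta(x0)) * covariant_derivative(psi, A, x0)
covariance_error = abs(lhs - rhs)

# Without the compensating shift of A the two sides differ.
uncompensated_error = abs(covariant_derivative(psi_prime, A, x0) - rhs)
```

The first error is at the level of finite-difference noise, while the second is of order |θ'(x)ψ(x)|.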

Identifying the "charge" e with the usual electric charge (this is the origin of the usage of the term in gauge theories),
and the gauge field A(x) with the four-vector potential of the electromagnetic field results in an interaction Lagrangian

$$\mathcal{L}_{\mathrm{int}} = \frac{e}{\hbar} \bar\psi(x) \gamma^\mu \psi(x) A_\mu(x) = J^\mu(x) A_\mu(x),$$

where $J^\mu(x)$ is the usual four-vector electric current density. The gauge principle is therefore seen to naturally
introduce the so-called minimal coupling of the electromagnetic field to the electron field.

Adding a Lagrangian for the gauge field $A_\mu(x)$ in terms of the field strength tensor exactly as in electrodynamics,
one obtains the Lagrangian which is used as the starting point in quantum electrodynamics:

$$\mathcal{L}_{\mathrm{QED}} = \bar\psi \left( i \hbar c \, \gamma^\mu D_\mu - m c^2 \right) \psi - \frac{1}{4\mu_0} F_{\mu\nu} F^{\mu\nu}.$$

See also: Dirac equation, Maxwell's equations, Quantum electrodynamics 

Mathematical formalism 

Gauge theories are usually discussed in the language of differential geometry. Mathematically, a gauge is just a 
choice of a (local) section of some principal bundle. A gauge transformation is just a transformation between two 
such sections. 

Although gauge theory is dominated by the study of connections (primarily because it's mainly studied by 
high-energy physicists), the idea of a connection is not central to gauge theory in general. In fact, a result in general 
gauge theory shows that affine representations (i.e. affine modules) of the gauge transformations can be classified as 
sections of a jet bundle satisfying certain properties. There are representations which transform covariantly pointwise 
(called by physicists gauge transformations of the first kind), representations which transform as a connection form 
(called by physicists gauge transformations of the second kind, an affine representation) and other more general 
representations, such as the B field in BF theory. There are more general nonlinear representations (realizations), but
these are extremely complicated. Still, nonlinear sigma models transform nonlinearly, so there are applications.

If there is a principal bundle P whose base space is space or spacetime and structure group is a Lie group, then the 
sections of P form a principal homogeneous space of the group of gauge transformations. 

A connection (gauge connection) on this principal bundle yields a covariant derivative $\nabla$ in each associated
vector bundle. If a local frame is chosen (a local basis of sections), then this covariant derivative is represented by
the connection form A, a Lie algebra-valued 1-form which is called the gauge potential in physics. This is evidently
not an intrinsic but a frame-dependent quantity. The curvature form F, a Lie algebra-valued 2-form which is an
intrinsic quantity, is constructed from a connection form by

$$F = dA + A \wedge A$$



where d stands for the exterior derivative and $\wedge$ stands for the wedge product. (A is an element of the vector space
spanned by the generators $T^a$, and so the components of A do not commute with one another. Hence the wedge
product $A \wedge A$ does not vanish.)
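
The non-vanishing of the wedge product can be illustrated with explicit matrices. The sketch below assumes the structure group SO(3), with the three standard antisymmetric generators written out by hand, and a connection whose x and y components at some point are multiples of the first two generators; the coefficients 2 and 3 are arbitrary. The (x, y) component of $A \wedge A$ is then the commutator of the two components.

```python
def matmul(X, Y):
    """Product of two 3x3 matrices given as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def commutator(X, Y):
    XY, YX = matmul(X, Y), matmul(Y, X)
    return [[XY[i][j] - YX[i][j] for j in range(3)] for i in range(3)]


def scale(c, X):
    return [[c * v for v in row] for row in X]


# Generators of rotations about the x, y and z axes.
T1 = [[0, 0, 0], [0, 0, -1], [0, 1, 0]]
T2 = [[0, 0, 1], [0, 0, 0], [-1, 0, 0]]
T3 = [[0, -1, 0], [1, 0, 0], [0, 0, 0]]

# Components of a matrix-valued connection at one point (illustrative values).
A_x = scale(2, T1)
A_y = scale(3, T2)

# (A wedge A)_{xy} = A_x A_y - A_y A_x
wedge_xy = commutator(A_x, A_y)

# For U(1) the components are ordinary numbers and the same expression vanishes.
abelian_wedge_xy = 2 * 3 - 3 * 2
```

Here the commutator comes out as 6 times the third generator, so the quadratic term in the curvature is genuinely present in the non-abelian case.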

Infinitesimal gauge transformations form a Lie algebra, which is characterized by a smooth Lie algebra-valued
scalar, $\varepsilon$. Under such an infinitesimal gauge transformation,

$$\delta_\varepsilon A = [\varepsilon, A] - d\varepsilon,$$

where $[\cdot, \cdot]$ is the Lie bracket.

One nice thing is that if $\delta_\varepsilon X = \varepsilon X$, then $\delta_\varepsilon DX = \varepsilon DX$, where D is the covariant derivative

$$DX = dX + AX.$$

Also, $\delta_\varepsilon F = [\varepsilon, F]$, which means F transforms covariantly.

Not all gauge transformations can be generated by infinitesimal gauge transformations in general. An example is
when the base manifold is a compact manifold without boundary such that the homotopy class of mappings from that
manifold to the Lie group is nontrivial. See instanton for an example.

The Yang-Mills action is now given by

$$\frac{1}{4g^2} \int \operatorname{Tr}[{*F} \wedge F],$$

where * stands for the Hodge dual and the integral is defined as in differential geometry.

A quantity which is gauge-invariant, i.e. invariant under gauge transformations, is the Wilson loop, which is defined
over any closed path, $\gamma$, as follows:

$$\chi^{(\rho)}\left( \mathcal{P} \left\{ e^{\int_\gamma A} \right\} \right),$$

where $\chi$ is the character of a complex representation $\rho$ and $\mathcal{P}$ represents the path-ordered operator.

Quantization of gauge theories

Gauge theories may be quantized by specialization of methods which are applicable to any quantum field theory.
However, because of the subtleties imposed by the gauge constraints (see section on Mathematical formalism,
above) there are many technical problems to be solved which do not arise in other field theories. At the same time,
the richer structure of gauge theories allows simplification of some computations: for example, Ward identities
connect different renormalization constants.

Methods and aims

The first gauge theory to be quantized was quantum electrodynamics (QED). The first methods developed for this
involved gauge fixing and then applying canonical quantization. The Gupta-Bleuler method was also developed to
handle this problem. Non-abelian gauge theories are now handled by a variety of means. Methods for quantization
are covered in the article on quantization.

The main point of quantization is to be able to compute quantum amplitudes for various processes allowed by the
theory. Technically, they reduce to the computations of certain correlation functions in the vacuum state. This
involves a renormalization of the theory.

When the running coupling of the theory is small enough, then all required quantities may be computed in
perturbation theory. Quantization schemes intended to simplify such computations (such as canonical quantization)
may be called perturbative quantization schemes. At present some of these methods lead to the most precise
experimental tests of gauge theories.

However, in most gauge theories, there are many interesting questions which are non-perturbative. Quantization
schemes suited to these problems (such as lattice gauge theory) may be called non-perturbative quantization
schemes. Precise computations in such schemes often require supercomputing, and are therefore less well-developed
currently than other schemes.

Anomalies 

Some of the symmetries of the classical theory are then seen not to hold in the quantum theory — a phenomenon 
called an anomaly. Among the most well known are: 

• The scale anomaly, which gives rise to a running coupling constant. In QED this gives rise to the phenomenon of 
the Landau pole. In Quantum Chromodynamics (QCD) this leads to asymptotic freedom. 

• The chiral anomaly in either chiral or vector field theories with fermions. This has a close connection with topology
through the notion of instantons. In QCD this anomaly causes the decay of a pion to two photons. 

• The gauge anomaly, which must cancel in any consistent physical theory. In the electroweak theory this 
cancellation requires an equal number of quarks and leptons. 

Pure gauge 

A pure gauge is the set of field configurations obtained by a gauge transformation on the null field configuration. So 
it is a particular "gauge orbit" in the field configuration's space. 

In the abelian case, where $A_\mu(x) \to A'_\mu(x) = A_\mu(x) + \partial_\mu f(x)$, the pure gauge is the set of field
configurations $A'_\mu(x) = \partial_\mu f(x)$ for all f(x).
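
An abelian pure gauge carries no field strength and gives a trivial Wilson loop, which can be checked numerically. The sketch below works in two dimensions, assumes an arbitrary smooth function f, and takes the U(1) Wilson loop to be the exponential of i times the closed line integral of A with the coupling absorbed into A (a normalization chosen here for illustration).

```python
import cmath
import math

H = 1e-4  # finite-difference step


def f(x, y):
    """Arbitrary smooth gauge function (illustrative choice)."""
    return math.sin(x) * math.cos(2.0 * y) + 0.3 * x * y


def A_x(x, y):
    """x-component of the pure-gauge field A = grad f."""
    return (f(x + H, y) - f(x - H, y)) / (2 * H)


def A_y(x, y):
    """y-component of the pure-gauge field A = grad f."""
    return (f(x, y + H) - f(x, y - H)) / (2 * H)


def field_strength(x, y):
    """F_xy = d_x A_y - d_y A_x."""
    d_x_A_y = (A_y(x + H, y) - A_y(x - H, y)) / (2 * H)
    d_y_A_x = (A_x(x, y + H) - A_x(x, y - H)) / (2 * H)
    return d_x_A_y - d_y_A_x


def loop_integral(n=200):
    """Line integral of A counter-clockwise around the unit square (midpoint rule)."""
    total = 0.0
    step = 1.0 / n
    for k in range(n):
        t = (k + 0.5) * step
        total += A_x(t, 0.0) * step         # bottom edge, left to right
        total += A_y(1.0, t) * step         # right edge, upward
        total -= A_x(1.0 - t, 1.0) * step   # top edge, right to left
        total -= A_y(0.0, 1.0 - t) * step   # left edge, downward
    return total


f_xy = field_strength(0.3, 0.7)
wilson_loop = cmath.exp(1j * loop_integral())
```

Both results agree with the null configuration up to discretization error, as expected for a configuration lying on its gauge orbit.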

Bibliography 

General readers: 

• Schumm, Bruce (2004) Deep Down Things. Johns Hopkins University Press. Esp. chpt. 8. A serious attempt by a 
physicist to explain gauge theory and the Standard Model with little formal mathematics. 

Texts: 

• Bromley, D.A. (2000). Gauge Theory of Weak Interactions. Springer. ISBN 3-540-67672-4. 

• Cheng, T.-P.; Li, L.-F. (1983). Gauge Theory of Elementary Particle Physics. Oxford University Press. 
ISBN 0-19-851961-3. 

• Frampton, P. (2008). Gauge Field Theories (3rd ed.). Wiley-VCH. 

• Kane, G.L. (1987). Modern Elementary Particle Physics. Perseus Books. ISBN 0-201-11749-5.

Articles:

• Becchi, C. (1997). Introduction to Gauge Theories [4].

• Gross, D. (1992). "Gauge theory - Past, Present and Future" [5]. Retrieved 2009-04-23.

• Jackson, J.D. (2002). "From Lorenz to Coulomb and other explicit gauge transformations" [6]. Am.J.Phys 70:
917-928. doi:10.1119/1.1491265.

• Svetlichny, George (1999). Preparation for Gauge Theory [7].



External links 

• Yang-Mills equations [8] on Dispersive Wiki

• Gauge theories [9] on Scholarpedia



References 

[1] Wolfgang Pauli (1941) "Relativistic Field Theories of Elementary Particles" (http://prola.aps.org/abstract/RMP/v13/i3/p203_1), Rev.
Mod. Phys. 13: 203-32.
[2] Pickering, A. (1984). Constructing Quarks (http://www.amazon.com/Constructing-Quarks-Sociological-History-Particle/dp/0226667995/ref=pd_bbs_sr_1?ie=UTF8&s=books&qid=1235837296&sr=8-1). University of Chicago Press. ISBN 0226667995.
[3] Sakurai, Advanced Quantum Mechanics, sect 1-4
[4] http://arxiv.org/abs/hep-ph/9705211
[5] http://psroc.phys.ntu.edu.tw/cjp/v30/955.pdf
[6] http://arxiv.org/abs/physics/0204034
[7] http://arxiv.org/abs/math-ph/9902027
[8] http://tosio.math.toronto.edu/wiki/index.php/Yang-Mills_equations
[9] http://www.scholarpedia.org/article/Gauge_theories



List of quantum field theories 

List of quantum field theories: 

• Chern-Simons model 

• Chiral model 

• Gross-Neveu 

• Kondo model 

• Lower dimensional quantum field theory 

• Minimal model 

• Nambu-Jona-Lasinio 

• Noncommutative quantum field theory 

• Nonlinear sigma model 

• Phi to the fourth 

• Quantum chromodynamics 

• Quantum electrodynamics 

• Quantum flavordynamics 

• Quantum Yang-Mills theory 

• Schwinger model 

• Sine-Gordon 

• Standard model 

• String Theory 

• Thirring model 

• Toda field theory 

• Topological quantum field theory 

• Wess-Zumino model 

• Wess-Zumino-Witten model 

• Yang-Mills 

• Yang-Mills-Higgs model 

• Yukawa model 



Noncommutative geometry 

Noncommutative geometry (NCG) is a branch of mathematics concerned with a geometric approach to
noncommutative algebras, and with the construction of spaces which are locally presented by noncommutative algebras
of functions (possibly in some generalized sense). A noncommutative algebra is here an associative algebra in which
the multiplication is not commutative, that is, for which xy does not always equal yx; or more generally an algebraic
structure in which one of the principal binary operations is not commutative; one also allows additional structures,
e.g. topology or norm, to be possibly carried by the noncommutative algebra of functions. The leading direction in
noncommutative geometry has been set by the French mathematician Alain Connes since his involvement from about
1979.

Motivation 

The main motivation is to extend the commutative duality between spaces and functions to the noncommutative setting.
In mathematics, there is a close relationship between spaces, which are geometric in nature, and the numerical 
functions on them. In general, such functions will form a commutative ring. For instance, one may take the ring C(X) 
of continuous complex-valued functions on a topological space X. In many important cases (e.g., if X is a compact
Hausdorff space), we can recover X from C(X), and therefore it makes some sense to say that X has commutative 
geometry. 
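
For a finite space the recovery of X from C(X) can be made concrete. The following toy sketch assumes a three-point space with the discrete topology, so that C(X) is the algebra of all functions on X with pointwise operations, and identifies points with multiplicative linear functionals; the labels and helper names are illustrative.

```python
from itertools import product

X = ["a", "b", "c"]  # a three-point space (hypothetical labels)


def multiply(f, g):
    """Pointwise product in C(X); functions are dicts from points to values."""
    return {x: f[x] * g[x] for x in X}


def indicator(x):
    """Function equal to 1 at x and 0 elsewhere; these span C(X)."""
    return {y: 1 if y == x else 0 for y in X}


def evaluate_at(x):
    """The linear functional f -> f(x)."""
    return lambda f: f[x]


def average(f):
    """A linear functional that is not an evaluation."""
    return sum(f.values()) / len(X)


def is_multiplicative(phi):
    """For a linear phi it is enough to test products of basis functions."""
    basis = [indicator(x) for x in X]
    return all(phi(multiply(f, g)) == phi(f) * phi(g)
               for f, g in product(basis, repeat=2))


def recovered_point(phi):
    """The unique point whose indicator function phi sends to 1, if any."""
    hits = [x for x in X if phi(indicator(x)) == 1]
    return hits[0] if len(hits) == 1 else None


recovered = [recovered_point(evaluate_at(x)) for x in X]
```

In this toy case the evaluations are multiplicative and each one singles out its point, while the averaging functional is linear but not multiplicative, so it does not correspond to a point.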

More specifically, in topology, compact Hausdorff topological spaces can be reconstructed from the Banach algebra 
of functions on the space (Gel'fand-Neimark). In commutative algebraic geometry, algebraic schemes are locally 
prime spectra of commutative unital rings (A. Grothendieck), and schemes can be reconstructed from the categories 
of quasicoherent sheaves of modules on them (P. Gabriel-A. Rosenberg). For Grothendieck topologies, the 
cohomological properties of a site are invariants of the corresponding category of sheaves of sets viewed abstractly as
a topos (A. Grothendieck). In all these cases, a space is reconstructed from the algebra of functions or its categorified 
version — some category of sheaves on that space. 

Functions on a topological space can be multiplied and added pointwise hence they form a commutative algebra; in 
fact these operations are local in the topology of the base space, hence the functions form a sheaf of commutative 
rings over the base space. 

The dream of noncommutative geometry is to generalize this duality to the duality between 

• noncommutative algebras, or sheaves of noncommutative algebras, or sheaf-like noncommutative algebraic or 
operator-algebraic structures 

• and geometric entities of certain kind, 

and to pass between the algebraic and geometric descriptions of these via this duality.

Given that commutative rings correspond to usual affine schemes, and commutative C*-algebras to usual
topological spaces, the extension to noncommutative rings and algebras requires a non-trivial generalization of
topological spaces, as "non-commutative spaces". For this reason, some talk about non-commutative topology,
though the term also has other meanings.



Applications in mathematical physics 

Some applications in particle physics are described in the entries Noncommutative standard model and
Noncommutative quantum field theory. The sudden rise of interest in noncommutative geometry in physics followed
speculations about its role in M-theory made in 1997 [1].

Motivation from ergodic theory 

Some of the theory developed by Alain Connes to handle noncommutative geometry at a technical level has roots in 
older attempts, in particular in ergodic theory. The proposal of George Mackey to create a virtual subgroup theory, 
with respect to which ergodic group actions would become homogeneous spaces of an extended kind, has by now 
been subsumed. 

Non-commutative C*-algebras, von Neumann algebras

(The formal duals of) non-commutative C*-algebras are often now called non-commutative spaces. This is by
analogy with the Gelfand representation, which shows that commutative C*-algebras are dual to locally compact
Hausdorff spaces. In general, one can associate to any C*-algebra S a topological space $\hat{S}$; see spectrum of a
C*-algebra.

In view of the duality between σ-finite measure spaces and commutative von Neumann algebras, noncommutative von
Neumann algebras are called non-commutative measure spaces.

Non-commutative differentiable manifolds 

A smooth Riemannian manifold M is a topological space with a lot of extra structure. From its algebra of continuous 
functions C(M) we only recover M topologically. The algebraic invariant that recovers the Riemannian structure is a 
spectral triple. It is constructed from a smooth vector bundle E over M, e.g. the exterior algebra bundle. The Hilbert 
space $L^2(M,E)$ of square integrable sections of E carries a representation of C(M) by multiplication operators, and we
consider an unbounded operator D in $L^2(M,E)$ with compact resolvent (e.g. the signature operator), such that the
commutators [D,f] are bounded whenever f is smooth. A recent deep theorem states that M as a Riemannian
manifold can be recovered from this data. 
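
The simplest instance is the circle. The following worked example is a sketch under the assumption that E is the trivial line bundle and that D is taken to be $-i\,d/d\theta$ rather than the signature operator; the last line is Connes' distance formula, quoted here as the standard way in which the metric is read off from such data.

```latex
\begin{align*}
A &= C(S^1), \qquad H = L^2(S^1), \qquad D = -i\,\frac{d}{d\theta}, \\
D\, e^{in\theta} &= n\, e^{in\theta} \quad (n \in \mathbb{Z}), \\
[D, f]\,\psi &= -i\,(f\psi)' + i\, f\, \psi' = -i\, f'\, \psi,
  \qquad \|[D, f]\| = \sup_\theta |f'(\theta)|, \\
d(\theta_1, \theta_2) &= \sup \bigl\{\, |f(\theta_1) - f(\theta_2)| \;:\; \|[D, f]\| \le 1 \,\bigr\}.
\end{align*}
```

The eigenvalues n tend to infinity in absolute value, so the resolvent is compact, and the commutator with a smooth function is multiplication by $-i f'$, which is bounded. The supremum in the last line is attained by functions of slope at most one and returns the arc-length distance between the two points.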

This suggests that one might define a noncommutative Riemannian manifold as a spectral triple (A,H,D), consisting 
of a representation of a C*-algebra A on a Hilbert space H, together with an unbounded operator D on H, with 
compact resolvent, such that [D,a] is bounded for all a in some dense subalgebra of A. Research in spectral triples is 
very active, and many examples of noncommutative manifolds have been constructed. 

Non-commutative affine and projective schemes 

In analogy to the duality between affine schemes and commutative rings, we define a category of noncommutative 
affine schemes as the dual of the category of associative unital rings. There are certain analogues of Zariski topology 
in that context so that one can glue such affine schemes to more general objects. 

There are also generalizations of the Cone and of the Proj of a commutative graded ring, mimicking Serre's
theorem on Proj. Namely the category of quasicoherent sheaves of O-modules on a Proj of a commutative graded
algebra is equivalent to the category of graded modules over the ring localized on Serre's subcategory of graded
modules of finite length; there is also an analogous theorem for coherent sheaves when the algebra is Noetherian. This
theorem is extended as a definition of noncommutative projective geometry by Michael Artin and J. J. Zhang [2],
who also add some general ring-theoretic conditions (e.g. Artin-Schelter regularity).

Many properties of projective schemes extend to this context. For example, there exists an analog of the celebrated
Serre duality for noncommutative projective schemes of Artin and Zhang [3].



A. L. Rosenberg has created a rather general relative concept of noncommutative quasicompact scheme (over a
base category), abstracting Grothendieck's study of morphisms of schemes and covers in terms of categories of
quasicoherent sheaves and flat localization functors [4]. There is also another interesting approach via localization
theory, due to Fred Van Oystaeyen, Luc Willaert and Alain Verschoeren, where the main concept is that of a
schematic algebra [5].

Invariants for noncommutative spaces 

Some of the motivating questions of the theory are concerned with extending known topological invariants to formal 
duals of noncommutative (operator) algebras and other replacements and candidates for noncommutative spaces. 
One of the main starting points of Alain Connes' direction in noncommutative geometry is his discovery
(made independently by Boris Tsygan) of a new homology theory associated to
noncommutative associative algebras and noncommutative operator algebras, namely cyclic homology, and of its
relations to algebraic K-theory (primarily via the Connes-Chern character map).

The theory of characteristic classes of smooth manifolds has been extended to spectral triples, employing the tools of 
operator K-theory and cyclic cohomology. Several generalizations of now classical index theorems allow for 
effective extraction of numerical invariants from spectral triples. The fundamental characteristic class in cyclic 
cohomology, the JLO cocycle, generalizes the classical Chern character. 

Examples of non-commutative spaces 

• In Weyl quantization, the symplectic phase space of classical mechanics is deformed into a non-commutative 
phase space generated by the position and momentum operators. 

• The standard model of particle physics is another example of a noncommutative geometry, cf. the noncommutative 
standard model. 

• The noncommutative torus, a deformation of the function algebra of the ordinary torus, can be given the structure 
of a spectral triple. This class of examples has been studied intensively and still functions as a test case for more 
complicated situations. 

• Snyder space [6] 

• Noncommutative algebras arising from foliations. 

• Examples related to dynamical systems arising from number theory, such as the Gauss shift on continued 
fractions, give rise to noncommutative algebras that appear to have interesting noncommutative geometries. 
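
As a toy illustration of the noncommutative torus mentioned in the list above, the sketch below builds the finite-dimensional "clock" and "shift" matrices U and V, which satisfy UV = exp(2πiθ) VU for the rational value θ = 1/n. This is only a matrix model of the defining relation (conventions for which side carries the phase vary), not the C*-algebra itself, and the helper names `clock_and_shift`, `matmul` and `max_abs_diff` are hypothetical, introduced here for illustration.

```python
import cmath

def clock_and_shift(n):
    """Return the n x n clock matrix U, shift matrix V, and omega = exp(2*pi*i/n)."""
    omega = cmath.exp(2j * cmath.pi / n)
    # U is diagonal with entries 1, omega, omega^2, ...
    U = [[omega ** i if i == j else 0j for j in range(n)] for i in range(n)]
    # V cyclically shifts basis vectors: V e_j = e_{(j+1) mod n}
    V = [[1 + 0j if i == (j + 1) % n else 0j for j in range(n)] for i in range(n)]
    return U, V, omega

def matmul(A, B):
    """Plain nested-list matrix product for square matrices."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def max_abs_diff(A, B):
    """Largest entrywise absolute difference between two matrices."""
    return max(abs(a - b) for ra, rb in zip(A, B) for a, b in zip(ra, rb))

n = 5
U, V, omega = clock_and_shift(n)
UV = matmul(U, V)
VU = matmul(V, U)
omega_VU = [[omega * x for x in row] for row in VU]

# UV agrees with omega * VU up to rounding, but differs from VU itself.
commutation_error = max_abs_diff(UV, omega_VU)
noncommutativity = max_abs_diff(UV, VU)
print(commutation_error, noncommutativity)
```

For irrational θ no finite matrices satisfy the relation, which is one reason the genuinely noncommutative torus has to be treated with operator-algebraic tools.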

Notes 

[1] Alain Connes, Michael R. Douglas, Albert Schwarz, Noncommutative geometry and matrix theory: compactification on tori. J. High Energy 
Phys. 1998, no. 2, Paper 3, 35 pp. doi (http://dx.doi.org/10.1088/1126-6708/1998/02/003), hep-th/9711162 (http://arxiv.org/abs/ 
hep-th/9711162) 

[2] M. Artin, J. J. Zhang, Noncommutative projective schemes, Adv. Math. 109 (1994), no. 2, 228-287, doi (http://dx.doi.org/10.1006/aima. 
1994.1087) 

[3] Amnon Yekutieli, James J. Zhang, Serre duality for noncommutative projective schemes, Proc. Amer. Math. Soc. 125 (1997), no. 3, 697-707, 
pdf (https://www.ams.org/proc/1997-125-03/S0002-9939-97-03782-9/S0002-9939-97-03782-9.pdf) 

[4] A. L. Rosenberg, Noncommutative schemes, Compositio Math. 112 (1998) 93-125, doi (http://dx.doi.org/10.1023/A:1000479824211); 
Underlying spaces of noncommutative schemes, preprint MPIM2003-111, dvi (http://www.mpim-bonn.mpg.de/preprints/send?bid=1947), 
ps (http://www.mpim-bonn.mpg.de/preprints/send?bid=1948); MSRI lecture Noncommutative schemes and spaces (Feb 2000): video 
(http://www.msri.org/publications/ln/msri/2000/interact/rosenberg/1/index.html) 

[5] Freddy van Oystaeyen, Algebraic geometry for associative algebras, ISBN 0-8247-0424-X, New York: Dekker, 2000, 287 p. (Monographs 
and textbooks in pure and applied mathematics, 232); F. van Oystaeyen, L. Willaert, Grothendieck topology, coherent sheaves and Serre's 
theorem for schematic algebras, J. Pure Appl. Alg. 104 (1995), p. 109-122 
[6] H. S. Snyder, Quantized Space-Time, Phys. Rev. 71 (1947) 38 






References 

• Connes, Alain (1994), Non-commutative geometry (http://www.alainconnes.org/docs/book94bigpdf.pdf), 
Boston, MA: Academic Press, ISBN 978-0-12-185860-5 

• Connes, Alain; Marcolli, Matilde (2008), "A walk in the noncommutative garden" (http://arxiv.org/abs/math/ 
0601054), An invitation to noncommutative geometry, World Sci. Publ., Hackensack, NJ, pp. 1-128, MR2408150 

• Connes, Alain; Marcolli, Matilde (2008), Noncommutative geometry, quantum fields and motives (http://www. 
alainconnes.org/docs/bookwebfinal.pdf), American Mathematical Society Colloquium Publications, 55, 
Providence, R.I.: American Mathematical Society, MR2371808, ISBN 978-0-8218-4210-2 

• Gracia-Bondía, José M.; Figueroa, Héctor; Várilly, Joseph C. (2000), Elements of Non-commutative geometry, 
Birkhäuser, ISBN 978-0817641245 

• Landi, Giovanni (1997), An introduction to noncommutative spaces and their geometries (http://arxiv.org/abs/hep-th/9701078), 
Lecture Notes in Physics. New Series m: Monographs, 51, Berlin, New York: Springer-Verlag, 
MR1482228, ISBN 978-3-540-63509-3 

• Van Oystaeyen, Fred; Verschoren, Alain (1981), Non-commutative algebraic geometry, Lecture Notes in 
Mathematics, 887, Springer-Verlag, ISBN 978-3540111535 

External links 

• Introduction to Quantum Geometry (http://www.matem.unam.mx/~micho/papers/qgeom.pdf) by Micho 
Durdevich 

• Lectures on Noncommutative Geometry (http://arxiv.org/abs/math/0506603) by Victor Ginzburg 

• Very Basic Noncommutative Geometry (http://arxiv.org/abs/math/0408416) by Masoud Khalkhali 

• Lectures on Arithmetic Noncommutative Geometry (http://arxiv.org/abs/math.qa/0409520) by Matilde 
Marcolli 

• Noncommutative Geometry for Pedestrians (http://arxiv.org/abs/gr-qc/9906059) by J. Madore 

• An informal introduction to the ideas and concepts of noncommutative geometry (http://arxiv.org/abs/math-ph/0612012) 
by Thierry Masson (an easier introduction that is still rather technical) 

• Noncommutative geometry on arxiv.org (http://xstructure.inr.ac.ru/x-bin/subthemes3.py?level=2&index1=-173391&skip=0) 

• MathOverflow, Theories of Noncommutative Geometry (http://mathoverflow.net/questions/10512/theories-of-noncommutative-geometry) 

• S. Mahanta, On some approaches towards non-commutative algebraic geometry, math.QA/0501166 
(http://arxiv.org/abs/math/0501166) 






Quantum gravity 

Quantum gravity (QG) is the field of theoretical physics attempting to unify quantum mechanics with general 
relativity in a self-consistent manner, or more precisely, to formulate a self-consistent theory which reduces to 
ordinary quantum mechanics in the limit of weak gravity (potentials much less than $c^2$) and which reduces to 
Einsteinian general relativity in the limit of large actions (action much larger than the reduced Planck constant). The 
theory must be able to predict the outcome of situations where both quantum effects and strong-field gravity are 
important (at the Planck scale, unless large extra dimension conjectures are correct). Motivation for quantizing 
gravity comes from the remarkable success of the quantum theories of the other three fundamental interactions. 
Although some quantum gravity theories such as string theory and other so-called theories of everything attempt to 
unify gravity with the other fundamental forces, others such as loop quantum gravity make no such attempt; they 
simply quantize the gravitational field while keeping it separate from the other forces. 

Observed physical phenomena in the early 21st century can be described well by quantum mechanics or general 
relativity, without needing both. This can be thought of as due to an extreme separation of mass scales at which they 
are important. Quantum effects are usually important only for the "very small", that is, for objects no larger than 
typical molecules. General relativistic effects, on the other hand, show up only for "very large" bodies such as 
collapsed stars. (Planets' gravitational fields, as of 2009, are well described by linearized gravity, so strong-field 
effects, that is, any effects of gravity beyond lowest nonvanishing order in $\phi/c^2$, have not been observed even in the 
gravitational fields of planets and main sequence stars.) There is a lack of experimental evidence relating to quantum 
gravity, and classical physics adequately describes the observed effects of gravity over a range of 50 orders of 
magnitude of mass, i.e. for masses of objects from about $10^{-23}$ to $10^{30}$ kg. 
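
As a rough illustration of why linearized gravity suffices for planets and main sequence stars, the sketch below evaluates the dimensionless surface potential |φ|/c² = GM/(Rc²) for the Earth and the Sun. The physical constants, masses and radii are approximate values supplied here as assumptions, not figures taken from the text, and `compactness` is a hypothetical helper name.

```python
G = 6.674e-11   # Newton's constant in m^3 kg^-1 s^-2 (approximate)
C = 2.998e8     # speed of light in m/s (approximate)

# Approximate masses (kg) and radii (m), assumed for illustration only.
BODIES = {
    "Earth": (5.972e24, 6.371e6),
    "Sun": (1.989e30, 6.957e8),
}

def compactness(mass_kg, radius_m):
    """Dimensionless surface potential |phi|/c^2 = G M / (R c^2)."""
    return G * mass_kg / (radius_m * C ** 2)

ratios = {name: compactness(m, r) for name, (m, r) in BODIES.items()}
for name, value in ratios.items():
    print(f"{name}: |phi|/c^2 ~ {value:.1e}")
```

Both ratios come out many orders of magnitude below 1, so terms beyond the lowest order in φ/c² are strongly suppressed for such bodies.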

Overview 

Much of the difficulty in meshing these theories at all energy scales comes from the different assumptions that these 
theories make about how the universe works. Quantum field theory depends on particle fields embedded in the flat 
space-time of special relativity. General relativity models gravity as a curvature within space-time that changes as a 
gravitational mass moves. Historically, the most obvious way of combining the two (such as treating gravity as 
simply another particle field) ran quickly into what is known as the renormalization problem. In the old-fashioned 
understanding of renormalization, gravity particles would attract each other, and adding together all of the 
interactions results in many infinite values which cannot easily be cancelled out mathematically to yield sensible, 
finite results. This is in contrast with quantum electrodynamics, where the perturbation series still does not converge 
and individual interactions sometimes evaluate to infinite results, but those infinities are few enough in number to be 
removable via renormalization. 

Effective field theories 

Quantum gravity can be treated as an effective field theory. Effective quantum field theories come with some 
high-energy cutoff, beyond which we do not expect that the theory provides a good description of nature. The 
"infinities" then become large but finite quantities proportional to this finite cutoff scale, and correspond to processes 
that involve very high energies near the fundamental cutoff. These quantities can then be absorbed into an infinite 
collection of coupling constants, and at energies well below the fundamental cutoff of the theory, to any desired 
precision, only a finite number of these coupling constants need to be measured in order to make legitimate 
quantum-mechanical predictions. This same logic works just as well for the highly successful theory of low-energy 
pions as for quantum gravity. Indeed, the first quantum-mechanical corrections to graviton scattering and Newton's 
law of gravitation have been explicitly computed (although they are so astronomically small that we may never be 
able to measure them). In fact, gravity is in many ways a much better quantum field theory than the Standard Model, 
since it appears to be valid all the way up to its cutoff at the Planck scale. (By comparison, the Standard Model is 
expected to start to break down above its cutoff at the much smaller scale of around 1000 GeV.) 

While this confirms that quantum mechanics and gravity are indeed consistent at reasonable energies, it is clear that 
near or above the fundamental cutoff of our effective quantum theory of gravity (the cutoff is generally assumed to 
be of order the Planck scale), a new model of nature will be needed. Specifically, the problem of combining quantum 
mechanics and gravity becomes an issue only at very high energies, and may well require a totally new kind of 
model. 
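
To give a sense of the scales involved in the effective field theory picture, the following sketch computes the Planck length and Planck energy from ħ, G and c. It then estimates two things: how small a correction of relative size (l_P/r)² to Newton's law would be at laboratory distances, and how small the expansion parameter (E/E_P)² is at 1000 GeV. The order-one `coefficient` argument is a hypothetical placeholder, not a computed value, and the function names are introduced here for illustration only.

```python
import math

HBAR = 1.054571817e-34            # reduced Planck constant, J s
G = 6.67430e-11                   # Newton's constant, m^3 kg^-1 s^-2
C = 2.99792458e8                  # speed of light, m/s
JOULES_PER_GEV = 1.602176634e-10  # conversion factor

# Planck length and Planck energy built from hbar, G and c.
planck_length = math.sqrt(HBAR * G / C ** 3)                        # metres
planck_energy_gev = math.sqrt(HBAR * C ** 5 / G) / JOULES_PER_GEV   # GeV

def relative_quantum_correction(r_m, coefficient=1.0):
    """Order-of-magnitude relative size coefficient * (l_P / r)^2 of the
    leading quantum correction to the Newtonian potential at distance r.
    The coefficient is an assumed placeholder of order one."""
    return coefficient * (planck_length / r_m) ** 2

def eft_expansion_parameter(energy_gev):
    """Dimensionless parameter (E / E_Planck)^2 controlling higher-order terms."""
    return (energy_gev / planck_energy_gev) ** 2

print(f"Planck length: {planck_length:.3e} m")
print(f"Planck energy: {planck_energy_gev:.3e} GeV")
print(f"Relative correction at 1 m: {relative_quantum_correction(1.0):.1e}")
print(f"(E/E_P)^2 at 1000 GeV: {eft_expansion_parameter(1000.0):.1e}")
```

Because both quantities are so far below 1, only the first few coupling constants matter at accessible energies, which is the sense in which the effective theory remains predictive.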

Quantum gravity theory for the highest energy scales 

The general approach to deriving a quantum gravity theory that is valid at even the highest energy scales is to 
assume that such a theory will be simple and elegant and, accordingly, to study symmetries and other clues offered 
by current theories that might suggest ways to combine them into a comprehensive, unified theory. One problem 
with this approach is that it is unknown whether quantum gravity will actually conform to a simple and elegant 
theory, as it should resolve the dual conundrums of special relativity with regard to the uniformity of acceleration 
and gravity, and general relativity with regard to spacetime curvature. 

Such a theory is required in order to understand problems involving the combination of very high energy and very 
small dimensions of space, such as the behavior of black holes, and the origin of the universe. 

Quantum mechanics and general relativity 



The graviton 

At present, one of the deepest problems in theoretical physics is harmonizing 
the theory of general relativity, which describes gravitation and applies to 
large-scale structures (stars, planets, galaxies), with quantum mechanics, 
which describes the other three fundamental forces acting on the atomic scale. 
This problem must be put in the proper context, however. In particular, 
contrary to the popular claim that quantum mechanics and general relativity 
are fundamentally incompatible, one can demonstrate that the structure of 
general relativity essentially follows inevitably from the quantum mechanics 
of interacting theoretical spin-2 massless particles (called gravitons). 

While there is no concrete proof of the existence of gravitons, quantized 
theories of matter may necessitate their existence. Supporting this view is 
the observation that every other fundamental force has one or more messenger 
particles, leading researchers to believe that at least one most likely exists 
for gravity as well; they have dubbed these hypothetical particles gravitons. 
Many of the accepted notions of a unified theory of physics since the 1970s, 
including string theory, superstring theory, M-theory and loop quantum gravity, 
all assume, and to some degree depend upon, the existence of the graviton. 
Many researchers view the detection of the graviton as vital to validating their 
work. 




Gravity Probe B (GP-B) has measured spacetime curvature near Earth to test related models in application of 
Einstein's general theory of relativity. 






The dilaton 

The dilaton made its first appearance in Kaluza-Klein theory, a five-dimensional theory that combined gravitation 
and electromagnetism. Generally, it appears in string theory. More recently, it has appeared in the lower-dimensional 
many-bodied gravity problem based on the field-theoretic approach of Roman Jackiw. The impetus arose from the 
fact that complete analytical solutions for the metric of a covariant N-body system have proven elusive in general 
relativity. To simplify the problem, the number of dimensions was lowered to (1+1), namely one spatial dimension 
and one temporal dimension. This model problem, known as R=T theory (as opposed to the general G=T theory), 
was amenable to exact solutions in terms of a generalization of the Lambert W function. It was also found that the 
field equation governing the dilaton (derived from differential geometry) was none other than the Schrödinger 
equation and consequently amenable to quantization. Thus, one had a theory which combined gravity, quantization 
and even the electromagnetic interaction, promising ingredients of a fundamental physical theory. It is worth noting 
that the outcome revealed a previously unknown and already existing natural link between general relativity and 
quantum mechanics. However, this theory still needs to be generalized to (2+1) or (3+1) dimensions, although, in 
principle, the field equations are amenable to such generalization. It is not yet clear what field equation will govern 
the dilaton in higher dimensions. This is further complicated by the fact that gravitons can propagate in (3+1) 
dimensions, which would imply that both gravitons and dilatons exist in the real world. Moreover, detection of 
the dilaton is expected to be even more elusive than that of the graviton. However, since this approach allows for the 
combination of gravitational, electromagnetic and quantum effects, their coupling could potentially lead to a means 
of vindicating the theory, through cosmology and perhaps even experimentally. 
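
For readers unfamiliar with the Lambert W function mentioned above, the sketch below evaluates the ordinary principal branch W0(x), defined by W0(x)·exp(W0(x)) = x, using Newton's method. This is the standard function only, not the generalization used in R=T theory, and `lambert_w0` is a hypothetical helper written for illustration.

```python
import math

def lambert_w0(x, tol=1e-12, max_iter=100):
    """Principal branch W0(x) for x >= 0, solving w * exp(w) = x by Newton's method."""
    if x < 0:
        raise ValueError("this sketch only handles x >= 0")
    # log(1 + x) is never below W0(x) for x >= 0, so Newton's iteration on the
    # convex function w * exp(w) - x decreases monotonically to the root.
    w = math.log1p(x)
    for _ in range(max_iter):
        ew = math.exp(w)
        step = (w * ew - x) / (ew * (w + 1.0))
        w -= step
        if abs(step) < tol:
            break
    return w

w_one = lambert_w0(1.0)    # the so-called omega constant
w_e = lambert_w0(math.e)   # equals 1, since 1 * exp(1) = e
print(w_one, w_e)
```

The defining relation can be checked directly by substituting the returned value back into w·exp(w).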

Nonrenormalizability of gravity 

General relativity, like electromagnetism, is a classical field theory. One might expect that, as with 
electromagnetism, there should be a corresponding quantum field theory. 

However, gravity is nonrenormalizable. [10] For a quantum field theory to be well-defined according to this 
understanding of the subject, it must be asymptotically free or asymptotically safe. The theory must be characterized 
by a choice of finitely many parameters, which could, in principle, be set by experiment. For example, in quantum 
electrodynamics, these parameters are the charge and mass of the electron, as measured at a particular energy scale. 

On the other hand, in quantizing gravity, there are infinitely many independent parameters needed to define the 
theory. For a given choice of those parameters, one could make sense of the theory, but since we can never do 
infinitely many experiments to fix the values of every parameter, we do not have a meaningful physical theory: 

• At low energies, the logic of the renormalization group tells us that, despite the unknown choices of these 
infinitely many parameters, quantum gravity